Preface


Nuclear thermal hydraulics is the application of thermofluid mechanics within the nuclear indus- 
try. Thermal hydraulic analysis is an important tool in addressing the global challenge to reduce 
the cost of advanced nuclear technologies. An improved predictive capability and understanding 
supports the development, optimisation and safety substantiation of nuclear power plants. 


This document is part of Nuclear Heat Transfer and Passive Cooling: Technical Volumes and Case 
Studies, a set of six technical volumes and four case studies providing information and guidance 
on aspects of nuclear thermal hydraulic analysis. This document set has been delivered by Frazer- 
Nash Consultancy, with support from a number of academic and industrial partners, as part of 
the UK Government Nuclear Innovation Programme: Advanced Reactor Design, funded by the 
Department for Business, Energy and Industrial Strategy (BEIS). 


Each technical volume outlines the technical challenges, latest analysis methods and future direc- 
tion for a specific area of nuclear thermal hydraulics. The case studies illustrate the use of a subset 
of these methods in representative nuclear industry examples. The document set is designed for 
technical users with some prior knowledge of thermofluid mechanics, who wish to know more about 
nuclear thermal hydraulics. 


The work promotes a consistent methodology for thermal hydraulic analysis of single-phase heat 
transfer and passive cooling, to inform the link between academic research and end-user needs, 
and to provide a high-quality, peer-reviewed document set suitable for use across the nuclear 
industry. 


The document set is not intended to be exhaustive or provide a set of standard engineering ‘guide- 
lines’ and it is strongly recommended that nuclear thermal hydraulic analyses are undertaken by 
Suitably Qualified and Experienced Personnel. 


The first edition of this document set has been authored by Frazer-Nash Consultancy, with the 
support of the individuals and organisations noted in each. Please acknowledge these documents 
in any work where they are used: 


Frazer-Nash Consultancy (2021) Nuclear Heat Transfer and Passive Cooling, 
Volume 4: Confidence and Uncertainty. 
Introduction and Technical Context


A necessary aspect of nuclear engineering professionalism includes having a questioning attitude. 
In the context of numerical models used to provide predictions across all aspects of modelling and
simulation, this is realised by not simply accepting or assuming that the application of, and outputs 
from, a software tool are correct without questioning a number of aspects. These are relevant even 
for widely accepted and applied tools: 


1. How will the results be used? 


2. Is the simulation ‘right’ and how much can I trust it: how adequate and realistic are the
inputs and results?


3. Am I using the tool within its design or its intended and trusted prediction envelope?


4. Have I made any errors, and how can I check this? What extent of independent review and
checking is appropriate?

5. How can I convince others, who will need to use the information to make a potentially costly
or safety critical decision, of the correctness of the results? Can I clearly communicate the
extent of, or reasons for, my level of confidence?


6. How can I help others understand and use these results, whilst ensuring that they are not
used out of context or inappropriately? This matters particularly because end users are unlikely
to have the same specialised domain knowledge and case-specific experience as the originator of
the results.


7. How much effort is warranted in considering these aspects, given the intended use of the 
results? 


Pragmatism will be necessary when performing a modelling analysis and forming answers to these 
questions for an industrial purpose. It will usually not be possible to wait until there are sufficiently 
large computational resources, improved models and tools, or high quality experimental data to 
fully resolve all aspects (or to justify their expense) because the analysis results will be required to 
make a decision in a timely manner. 


Alternatively, the decision being made may be sufficiently important that a substantial investment 
in extensive computing and experimental assessment proves to be necessary. This could cause 
increased costs and operational or construction delays. Therefore, the evidence and justification 
for extensive modelling activities relies on knowing, with enough rigour and precision, which as- 
pects are most uncertain, which aspects matter most, and what extent of further evaluation or 
assessment is required. 


Any assessment of Nuclear Thermal Hydraulics (NTH) model calculations with a nuclear safety
implication must be aligned with the expectations of a regulator (Volume 1, Section 2.2). In particular,
this means applying a graded approach, where the effort and rigour are proportionate to the
significance of the decision that an analysis supports.


This technical volume aims to provide assistance by describing two interrelated themes, aimed at 
assessing the credibility that can be attributed to the results of NTH models: 


Confidence: Good modelling and working practices to establish and communicate confidence¹ in
thermal hydraulic models and their simulation results. This includes applying quality assur- 
ance and review practices, performing Verification and Validation (V&V), applying recognised 
best practices, documenting work, and archiving to allow its easy retrieval. 


Uncertainty: Uncertainty represents an inability to state a definitively accurate and precise result 
for a modelling prediction. Here it involves understanding the types, sources and magnitudes 
of uncertainty in a thermal hydraulic analysis. This enables Uncertainty Quantification (UQ) 
to be applied to metrics of interest for reactor safety or performance to determine the signif- 
icance of inherent variability, gaps in knowledge, or approximations and simplifications in a 
model. It also allows the evaluation of which model inputs or parameters matter most to a 
prediction via Sensitivity Analysis (SA). 


V&V, UQ and SA are not specifically thermal hydraulic or nuclear industry techniques — they are 
large and established disciplines that are applied in many branches of engineering and science. 
Their relevance to NTH, and the scrutiny that they are subject to, is growing, however. Large in- 
creases in the availability of computational resources and advanced methods such as multiphysics 
coupling have enabled increases in model complexity and scope. This has resulted in the wider 
application and greater importance of computational methods within design and safety. 


The concepts that are needed for UQ² are introduced in Section 2. Combining these approaches
and methods and applying them to practical industrial NTH analysis using system codes and Com- 
putational Fluid Dynamics (CFD) models is described in Section 3, then mathematical methods 
that are applied for UQ are detailed in Section 4. 


Building Confidence in Nuclear Thermal Hydraulic Models 


Using model predictions with confidence requires individuals and organisations to not only per- 
form specific mathematical and computational tasks, but to understand the reasons why they are 
performing them, and to fit their outcomes into a broader structure. A hierarchy of considerations 
is typically involved in establishing and maintaining this confidence, each tier being a subset and 
enabler of the tier above: 


1. Organisational approaches that determine, control, record and communicate what activities 
are needed to support the application of analysis tools. 


2. Activities that are component parts supporting an analysis performed within or contributing 
to the overarching organisational approach. 


3. Mathematical methods and software tools that are applied in performing these activities. 


1 Here the word ‘confidence’ is used in a wide and qualitative sense; it also has a specific quantitative meaning in relation to
confidence intervals, as discussed in Section 4.7

2 When UQ is discussed in this technical volume, the concepts and methods are generally applicable to SA too, and although
they are distinct activities, UQ will often be used for brevity to implicitly refer to both.
These considerations correspond to Figure 1.1, which is intended to show the generic stages
where decisions and evaluations are made when embarking on an analysis task for a particular 
purpose, and how they sit within a broader context. An analysis task has a set of requirements that 
define and initiate it, and its output will usually deliver a report that can inform a decision. 


An analysis will sit within the organisational context above it (Tier 1 in the list above) and will 
perform component activities (from ‘Model Design’ to ‘Overall Confidence’, Tier 2) making use of 
methods and tools below (Tier 3). 


[Figure 1.1 (diagram not reproduced). Top tier: the organisational context in which an analysis is made.
The analysis can be used to develop and substantiate methods, or may draw on established capabilities or
sources of validation data. Middle tier: the component activities Model Design, Verification and QA,
Validation, UQ and SA, and Overall Confidence. These do not need to be sequential and linear; feedback
connections are possible between all stages where a need for improvement or revision is identified.
Bottom tier: mathematical and statistical methods, software packages and simulation automation. These
may be developed or substantiated as part of the analysis, or already established and trusted.]

Figure 1.1: Activities to be performed to support having confidence in the results of an
analysis task, and the context in which they take place.


The organisational context includes, for example, established Quality Assurance (QA) and V&V
procedures and guidance that an organisation has defined to enable analysis methods to be applied
in a consistent manner. The particular analysis being performed may be to solve a specific
design or operational problem, or may be part of a programme of developing the organisational
approach. It may also have the purpose of specifically developing or enhancing numerical
simulation software (‘codes’) or producing solution automation methods. Alternatively, these may
just arise as a valuable by-product, available for use in future analyses.


Model Design and Construction 


The type of analysis that will be performed, which tools are used, and what type of activities are 
necessary to demonstrate confidence in the outputs depend on its intended use, as defined by its 
requirements³. A substantial number of decisions need to be made in choosing an overall approach
and designing and constructing an NTH model. These decisions can significantly affect its eventual 
utility and cost: 


* What is the safety or economic risk or importance influenced by how the output will be used?
Does it inform concept design iterations, or is it evidence for a key safety case claim?

* What are the quantities of interest?

* Do the quantities of interest predicted by the model represent the mean behaviour of a system,
or are they the extent (or chance) of the occurrence of an extreme aspect of its behaviour?⁴

* Is the method a standard technique or a bespoke and novel model? Is the use of an existing
technique within the scope of its accepted purpose, or is it being used in a new context?

* What amount of extrapolation to ‘new’ areas of the parameter space is required? How far
are new simulations outside the range of previous applications of the model, its established
validation envelope, or the validity range of its assumptions and submodels?

* How computationally expensive is the model to run?

* How robust is the solution process?

  - Will it always either reliably converge, or fail to produce a solution in an informative and
    deterministic way?
  - Alternatively, is it prone to unpredictable numerical divergence, and/or can it fail to run
    properly in a non-obvious way?

  The latter type may be hard to avoid for complex models, and may require significant manual
  inspection of the solutions and case-by-case intervention in their setup and solution process,
  limiting the amount of automation that can be applied to them.

* What existing guidance and best practice is available for this type of analysis? Has a search
of the available literature been conducted?

* Are there regulatory or contractual requirements that mean that some methods must (or may
not) be used?

* Who is performing the analysis and review? How experienced are the personnel? Do they
have sufficient mentoring and supervision?

3 Systems engineering, as used to capture and demonstrate compliance with safety and operational requirements, is an
established discipline in itself, with its own methods and tools. Similarly, the creation of safety cases, using methods like
Probabilistic Safety Analysis (PSA), is its own discipline, and is briefly discussed in Volume 1, Section 2.2. The requirements
for an analysis are likely to arise from these activities, but details of this process are not covered in this volume.


When using numerical models (NTH simulations being one example of many) to demonstrate confi- 
dence that a Nuclear Power Plant (NPP) will remain within safety limits, two categories of approach 
are generally considered. 


Conservative approach: Historically, incomplete knowledge of plant behaviour and limited com- 
puting capability for modelling thermal hydraulic phenomena led to the application of a con- 
servative (and thus bounding) methodology. This used both a conservative (or ‘pessimistic’) 
model/code and conservative values of model parameters to demonstrate a high level of 
confidence that a prediction bounded the worst-case (most onerous) behaviour of a system. 


BEPU approach: More recently a Best Estimate Plus Uncertainty (BEPU) approach has increasingly
been adopted. This uses a more realistic model of thermal hydraulic phenomena and
more realistic input data, with due allowance made for uncertainties when analysing
the model results.


The two approaches can differ greatly in cost and complexity, and both have a role to
play at various stages of design or licensing, as determined by using a graded approach (Volume 1,
Section 2.2). In both cases, steps are necessary to ensure that confidence in the thermal hydraulic
model is justified and appropriately communicated, and that adequate margin will be maintained to
the safety limits.

4 Another interpretation of this is to ask whether it is the combined worst cases of its aleatory uncertainty that are
important (Section 1.2).


The division of approaches into conservative and BEPU was developed primarily for
numerical modelling to support safety assessments. However, modelling is also performed for a
range of design and accident scenario assessments that have a bearing on the operation and
economic performance of a plant, and the decisions based on it can have consequences
for the viability of a technology or development project. There are therefore related motivations to
know how much confidence to place in an assessment, and what steps will be most effective to
increase that confidence.


Verification and Quality Assurance 
Verification and quality assurance activities are grouped into three main categories. 


Code Verification provides confidence that the mathematical equations have been implemented 
correctly and that the algorithms used to solve them behave as required in the software tool 
used. 


Solution Verification is often defined as assessing the numerical error in specific
computational simulations (rather than generically of the tool itself), arising mainly from
iterative convergence and discretisation errors. In this context, it typically focuses primarily
on how the solution converges as the geometrical discretisation (mesh or nodalisation) is
refined.


Quality Assurance Verification provides confidence that the model inputs and geometry have 
been correctly set up, that the model has been correctly solved, and that the results are 
extracted, analysed and presented properly. Some of these activities are included under de- 
scriptions of ‘Solution Verification’ in some references, but are often only given brief coverage. 
This ensures, through peer-review and independent inspection, code-to-code comparison, 
checking and approval that the model (including the software via specific Software Qual- 
ity Assurance (SQA)) and its reporting do not contain errors, that they conform to specified 
standards and requirements, and that traceable evidence of this is produced. 
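
As an illustration of the solution verification activity described above, the convergence of a
prediction with mesh refinement is often summarised by an observed order of convergence and a
Richardson-extrapolated estimate of the zero-spacing value. The sketch below is a minimal example
only: the three peak temperatures are hypothetical values, and it assumes a constant refinement
ratio, monotonic convergence and solutions that lie in the asymptotic range.

```python
import math


def observed_order(f_coarse, f_medium, f_fine, r):
    """Observed order of convergence from three grids with constant refinement ratio r.

    Assumes monotonic convergence, so the ratio of successive differences is positive.
    """
    return math.log((f_coarse - f_medium) / (f_medium - f_fine)) / math.log(r)


def richardson_extrapolate(f_medium, f_fine, r, p):
    """Estimate of the zero-spacing solution from the two finest grids."""
    return f_fine + (f_fine - f_medium) / (r**p - 1.0)


# Hypothetical peak temperatures (K) on coarse, medium and fine meshes.
# They were manufactured as 600 + 8*h**2 with h = 4, 2, 1, so the expected order is 2.
f_coarse, f_medium, f_fine = 728.0, 632.0, 608.0
r = 2.0  # refinement ratio between successive meshes

p = observed_order(f_coarse, f_medium, f_fine, r)
f_extrapolated = richardson_extrapolate(f_medium, f_fine, r, p)
fine_grid_error = abs(f_fine - f_extrapolated) / abs(f_extrapolated)

print(f"observed order = {p:.2f}")
print(f"extrapolated value = {f_extrapolated:.1f} K")
print(f"estimated relative error on fine mesh = {fine_grid_error:.3%}")
```

An observed order that differs markedly from the formal order of the numerical scheme is usually a
prompt to check whether the meshes are fine enough for the estimate to be meaningful.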


Deciding which of these processes to apply, to what extent and by which methods is affected by 
several considerations: 


* What procedures for verification and QA does the organisation instructing the analysis or
regulator require to be followed, and what standard of record keeping and evidence is required?

* What are the key things to check: where are errors most likely to occur and which are of the
most consequence?

* Is the decision relying on the result so significant that an independent (blind) parallel
calculation needs to be performed by a different group?

* Is additional code verification required, or is the code sufficiently mature?

* Could an analyst be expected to be knowledgeable enough about how a code is built, and
how its numerics operate, to meaningfully assess the correctness of its operation, or does it
need one of the code developers to participate?

* Are there sufficiently experienced analysts, who have been independent of the work, available
to perform quality assurance verification?

* What are the resource requirements for these activities, and are their availability, time and
costs allowed for? Verification and QA can represent an additional cost that is a significant
proportion of the cost of originating an analysis and reporting it. Code developers or sufficiently
experienced verifiers and reviewers also may only exist outside of the organisation
performing the analysis.


Validation 


Validation has a number of generally agreed and compatible definitions⁵; in this technical volume
it is conceptualised as:


Validation: Comparison of model predictions against measurements, to provide confidence that 
simulation results represent reality with sufficient accuracy. 


Several questions should be considered regarding the availability, quality and application
of experimental or plant data:


* What data is available to validate a model against?

* Does the data come from small-scale laboratory tests, large-scale ‘test rigs’, or full-scale
plant operation?

* Is the available data good enough, or is more extensive or better (higher resolution, better
accuracy) data needed?

* Are the uncertainties in the measured data and test geometry understood and quantified?

* Does the data cover the region of parameter space of interest for the real application? Is
the process of scaling the data to the real application well developed and robust, with the
distortions it introduces well quantified?

* Has a similar model using the same or a similar code been validated against relevant data
before? What evidence is necessary to provide sufficient reason to trust that the extent of the
‘similarity’ allows a new model to inherit the confidence from that validation evidence, without
repeating it?

* Will a part of the data be used to calibrate the model, and so cannot be used for validation?
If so, which parts of the data should be reserved for blind validation comparisons after
calibration?


There are also questions related to the interpretation of comparisons to data: 


* What is the appropriate validation metric, and what does obtaining a certain level of agreement
with it imply about the confidence that can be placed in the model?

* For example, is the data representative of the most important model outputs, or is it a parameter
that is convenient to measure, with some separation from the more important location?
In this case, are there ways of showing how well agreement with this proxy measurement
relates to agreement at the key location?

* Does the presence of intrusive instrumentation or test fixtures appreciably change the flow or
phenomena, such that they should be physically modelled as part of the validation? Similarly,
is the instrument operating in regions of strong gradients or fluctuations, and so does the
dynamic or spatial averaging response of the measurement to these quantities need to be
anticipated and replicated in the validation simulation?

* Is the data an integral quantity (a whole channel flow rate or mean temperature) that can
show good agreement, while masking a disagreement with the details (e.g. the profile of
velocity or temperature across a channel)? Similarly, is an extreme value being well predicted
(such as a maximum component temperature), but the location where it occurs is either not
known, or in disagreement? Being aware of the agreement at a level of detail able to test
physical mechanisms⁶ reduces the chance of a result that agrees well by compensating
errors (‘being right for the wrong reason’).

* Is the comparison able to be made at more than one operating condition? It may be possible
to show good agreement with data for one combination of inputs (either by coincidence or via
calibrated settings), where the effect of compensating errors or insensitivity to parameters at
that condition coincide to cancel each other out, but the agreement diverges when moving to
another operating point.

5 The ASME definitions for V&V are discussed in Volume 1, Section 4.3, although others are in use.
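
The point above about integral quantities masking disagreement in the details can be shown with a
small numerical sketch. Both profiles below are synthetic and hypothetical: the ‘measured’ profile is
a parabola across a channel of unit width and the ‘predicted’ profile is flat, with the two chosen
to have the same channel average. The function name and the choice of metrics are illustrative
assumptions rather than a recommended validation metric.

```python
import numpy as np


def channel_average(values, y):
    """Trapezoidal average of a profile across a channel of unit width."""
    return float(np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(y)))


# Normalised position across the channel (0 and 1 are the walls).
y = np.linspace(0.0, 1.0, 101)

# Synthetic 'measured' velocity profile (normalised by the mean velocity): a parabola.
measured = 1.5 * (1.0 - (2.0 * y - 1.0) ** 2)

# Synthetic 'predicted' profile: uniform flow with the same mean velocity.
predicted = np.ones_like(y)

# Integral comparison: the channel-averaged values agree closely.
integral_error = abs(channel_average(predicted, y) - channel_average(measured, y))

# Detailed comparison: the profiles themselves differ substantially.
profile_rms_error = float(np.sqrt(np.mean((predicted - measured) ** 2)))
max_local_error = float(np.max(np.abs(predicted - measured)))

print(f"difference in channel average = {integral_error:.4f}")
print(f"RMS difference in profile     = {profile_rms_error:.3f}")
print(f"largest local difference      = {max_local_error:.3f}")
```

A comparison based only on the channel average would report near-perfect agreement here, while
the local values differ by up to the full mean velocity at the walls.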


Uncertainty Quantification and Sensitivity Analysis 


UQ and SA are related, but distinct, activities used to understand and quantify the implications of 
modelling uncertainties. 


Uncertainty Quantification: Establishing the range of, and likelihood for, values that a simulation
prediction could cover, which should (with an estimate of confidence) encompass the ‘true’ 
(but unknown) real value that the physical system that has been modelled would exhibit. 


Sensitivity Analysis: Identifying how the uncertainty in a model output can be apportioned to
different sources of uncertainty in its inputs⁷, and hence where efforts in terms of validation
and UQ should be concentrated.


In this document, the term ‘sensitivity assessment’ will be used to describe activities in a broader 
sense and will cover less formal sensitivity studies, whereas SA will refer to the more formal and 
rigorous numerical analysis. 
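
To make the idea of apportioning output uncertainty to inputs concrete, the sketch below estimates
first-order variance-based (Sobol) sensitivity indices with a sampling estimator of the ‘pick and
freeze’ type. This is one possible SA technique among several and is shown here as an assumption,
not as the method prescribed by this volume. The three-input linear model and its standard normal
inputs are hypothetical; for this model the exact indices are 0.9, 0.1 and 0.

```python
import numpy as np


def first_order_sobol(model, sample_inputs, n_samples, rng):
    """Estimate first-order Sobol indices using two independent sample matrices A and B."""
    a = sample_inputs(n_samples, rng)
    b = sample_inputs(n_samples, rng)
    f_a = model(a)
    f_b = model(b)
    total_variance = np.var(np.concatenate([f_a, f_b]))

    indices = []
    for i in range(a.shape[1]):
        # Matrix equal to A except that column i is taken from B.
        a_bi = a.copy()
        a_bi[:, i] = b[:, i]
        # f_b and model(a_bi) share only input i, so this isolates its contribution.
        partial_variance = np.mean(f_b * (model(a_bi) - f_a))
        indices.append(partial_variance / total_variance)
    return np.array(indices)


def toy_model(x):
    """Hypothetical output metric: linear in three normalised inputs."""
    return 3.0 * x[:, 0] + 1.0 * x[:, 1] + 0.0 * x[:, 2]


def sample_inputs(n, rng):
    """Independent standard normal inputs (an illustrative assumption)."""
    return rng.standard_normal(size=(n, 3))


rng = np.random.default_rng(seed=0)
sobol = first_order_sobol(toy_model, sample_inputs, n_samples=50_000, rng=rng)
print("first-order indices:", np.round(sobol, 2))
```

The estimator needs one extra set of model evaluations per input, which is one reason why
computational cost matters when choosing an SA method.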


These topics will be discussed in significantly more detail in this volume, but high level questions 
that should be considered at the start of an analysis are: 


* Are the key sources of uncertainty understood? 
* What method should be chosen to assess uncertainty?


* Is the dominant uncertainty in model inputs, or in the model itself? 


6 In NTH this is often called having ‘CFD grade’ data.
7 From the definition given by Saltelli et al. (2007).
Overall Estimate of Confidence in a Prediction


All of the component activities outlined previously help form a case to answer the questions posed 
at the start of Section 1. Organisational approaches (examples of which are described in Sec- 
tion 3.1) provide mechanisms and frameworks assisting with, and progressing through, these deci- 
sions and activities. However, for any particular analysis task the choices made about each activity 
depend on the context and case specific requirements, the purpose of the model, and the onward 
use of the simulation results. 


For example, if a conservative approach is adopted, then some SA and UQ activities may be
required⁸. If such a conservative model is already a credible method endorsed by a
design authority or regulator as suitable to demonstrate compliance with operational limits, then
additional validation is unlikely to be necessary.


When a conservative approach is not possible, or is not appropriate for other reasons, then a 
suitable level of UQ is required. This will generically involve: 

* Identifying the phenomena that need to be modelled.

* Identifying the specific output metrics that the model must predict.

* Identifying relevant sources of uncertainty, and their likely range.

* Identifying the sensitivity of the output metrics to these sources.

* Identifying how simplifications or incomplete descriptions of the model physics, and numerical
errors in its solution, contribute.

* Performing model evaluations to establish a probability distribution for each output metric.

* Assessing the implication of these probability distributions in the context of the problem being
solved.


As noted in Figure 1.1, these do not need to be performed as sequential and linear steps. The 
‘output metric’ is a specific property at a known location that is of key significance to the safety or 
performance of a plant or system. This can also be referred to as a ‘quantity of interest’, System 
Response Quantity (SRQ) or ‘figure of merit’. 


Of all the component activities identified, this volume aims to describe UQ in the most depth. 
Section 3 also provides context and sources of information on the other component activities. 


Uncertainty 


The conservative bounding approach to thermal hydraulic assessment is widely applied and can
have the advantage that the models used are relatively cheap to develop and run. However, in
some circumstances, it can result in excessive pessimisms that may lead to the prediction of
an incorrect progression of events or unrealistic timescales of fault evolution. It can also predict
behaviour so onerous that a safety case could not be made, requiring reassessment and reduction
of conservatisms. This can cause focus to be misplaced on identifying and addressing safety
issues that actually pose little risk. A corollary of this is that complexity in systems and models
challenges the ability to know a priori what conservative approximations are necessary, and
conservative predictions have been shown not to bound the upper uncertainty band of BEPU
predictions at all points of some transient simulations (Glaeser, 2008).

8 Mainly what is necessary to demonstrate that an approach is actually conservative, which may not be apparent or able to
be determined from the outset.


This has been acknowledged internationally, and since the late 1980s there has been a move
towards making evaluations on a BEPU basis (Wilson, 2013). This approach uses a realistic model of
thermal hydraulic phenomena and realistic input data to produce a best estimate output. Uncertainties
in the inputs and modelling approach must then be accounted for when analysing the model
results. This has the advantage that it provides a realistic view of how an operational scenario
can progress, and an appropriate treatment of uncertainties ensures that an adequate margin is
maintained between an uncertainty bound on the output metric and a safety limit. A realistic model
has the further benefit of providing better insight into real plant behaviour, which can inform its
monitoring, diagnostics and maintenance.


However, UQ is not necessarily straightforward, and under-estimating the uncertainty could lead to 
a potentially unsafe or misleading result. The relationship between uncertainty, margin and BEPU 
vs conservative assessments is illustrated in Figure 1.2 (Figure 2.6 from Volume 1), and is dis- 
cussed further in a licensing context in IAEA (2003). 


[Figure 1.2 (diagram not reproduced). Four panels, titled Best Estimate, Conservative, BEPU and
Overly Conservative, each showing the position of the assessment result relative to the Safety Limit.]

Figure 1.2: Margins and uncertainty in best estimate, conservative, BEPU and overly
conservative assessments.


This technical volume provides an introduction to the methods applied and references to current 
international nuclear industry best practice guidance describing how to perform UQ in support of a 
BEPU approach to thermal hydraulic modelling. 


Figure 1.3 illustrates an overview of the aims of UQ. For a given thermal hydraulic model, different
sources of uncertainty exist, including uncertainty in the model inputs and uncertainty in the model
itself. When the identified sources of uncertainty are accounted for (for example by propagating
a range of model inputs through the calculation route using a range of numerical model options),
the result is a distribution of the output metric of interest (in the example shown in Figure 1.3, a
particular temperature). This can be used to determine the best estimate value⁹ of a temperature of
interest, T_BE, as well as a higher temperature, T_CONF, for which there is a calculated probability
(with a known, numerical level of confidence) that it will be exceeded.


[Figure 1.3 (diagram not reproduced). Uncertain inputs (geometry, operating scenario, boundary
conditions, initial conditions, material properties) pass through the simulation model, where each
stage (governing equations, discretised equations, numerical solutions) can introduce uncertainties.
The uncertain output is shown as a probability distribution of temperature.]

Figure 1.3: Uncertain inputs propagated through a model to produce a best estimate
prediction and quantify the uncertainty in the prediction.
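
When an upper bound on an output metric is estimated from a limited number of simulations, both a
probability content and a confidence level have to be stated. One commonly used non-parametric
way of linking these to the number of runs is an order-statistics tolerance limit, often referred to
as Wilks' method. It is shown here only as an illustrative assumption, and the function name is
hypothetical. The sketch gives the smallest number of random runs for which the largest computed
value bounds a given fraction of the output distribution with a given confidence (first-order,
one-sided case).

```python
import math


def wilks_sample_size(coverage, confidence):
    """Smallest N such that 1 - coverage**N >= confidence.

    coverage:   fraction of the output distribution to be bounded (e.g. 0.95)
    confidence: confidence that the largest of N results bounds that fraction
    """
    return math.ceil(math.log(1.0 - confidence) / math.log(coverage))


n_95_95 = wilks_sample_size(coverage=0.95, confidence=0.95)
n_95_99 = wilks_sample_size(coverage=0.95, confidence=0.99)

print(f"95% coverage, 95% confidence: {n_95_95} runs")  # 59
print(f"95% coverage, 99% confidence: {n_95_99} runs")  # 90
```

The required number of runs does not depend on the number of uncertain inputs, which is part of
the appeal of this type of method for expensive models.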


There are two types of uncertainty in any model: 


Aleatory Uncertainty which occurs due to the naturally occurring random variability in physical 
processes and input parameters and therefore cannot easily be reduced. Aleatory uncer- 
tainty is also referred to as physical variability, irreducible uncertainty and Type A uncertainty. 


Epistemic Uncertainty is caused by a lack of or incomplete knowledge about some part of the 
physical system being modelled, due to a simplified modelling approach that approximates 
certain physical phenomena (or couplings between phenomena), or by the use of represen- 
tative data, rather than plant or circumstance specific data. Epistemic uncertainty can be 
reduced through gathering more (relevant) data, by increasing understanding of the physical 
system or increasing the fidelity of the modelling approach. This type of uncertainty is also re- 
ferred to as the state of knowledge uncertainty, subjective uncertainty, reducible uncertainty 
or Type B uncertainty. 


Roy and Oberkampf (2011) give more detailed definitions of these two types of uncertainty. Separating
uncertainties into aleatory and epistemic¹⁰ provides clarity on which uncertainties could be
reduced through, for example, improved validation against more measurement data or by increas- 
ing the complexity or specificity of a model. However, gathering the most pertinent data or making 
model improvements may be sufficiently difficult or costly that it cannot be accomplished, and so 
delineating between these two types of uncertainty can be ambiguous. 


The first step in UQ is identifying the sources of uncertainty in the analysis, typically as the uncer- 
tainty in the input data or in how well the model represents reality. Then, depending on the nature 
of uncertainty and numerical model, different methods exist for assessing the uncertainty in the 
outputs of an analysis. 


For example, an uncertain input x may be characterised by a Probability Density Function (PDF), 


9 In general the best estimate value is the distribution median. In this sketch, the mean, median and mode (most-probable)
value coincide, because it is represented as a normal (Gaussian) distribution. In general, the distribution may be skewed, 
and these values will not be the same. 

10 To help remember which type is which: alea is the Latin for a game of dice — ‘Alea iacta est’ (‘The die is cast’); epistemology 
is the branch of philosophy concerned with the theory of knowledge. 






p(x). This PDF¹¹ would then be sampled and the sampled values used as the inputs for simulations
using the model (that is, propagated through it) to inform the uncertainty in the output parameter of interest.
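
As a minimal sketch of this sampling and propagation step, the Python fragment below draws values from assumed input PDFs and passes them through a stand-in model. The function `toy_wall_temperature`, the chosen distributions and all numerical values are hypothetical and for illustration only; in a real analysis the function would be replaced by a run of the NTH model.

```python
import numpy as np


def toy_wall_temperature(bulk_temp, heat_flux, htc):
    """Hypothetical stand-in for an NTH model: T_wall = T_bulk + q'' / h."""
    return bulk_temp + heat_flux / htc


def propagate(n_samples=10_000, seed=0):
    """Sample the assumed input PDFs and propagate each sample through the model."""
    rng = np.random.default_rng(seed)
    # Assumed input PDFs p(x); the values are illustrative, not plant data.
    bulk_temp = rng.normal(290.0, 2.0, n_samples)         # degC
    heat_flux = rng.normal(5.0e5, 2.5e4, n_samples)       # W/m^2
    htc = rng.lognormal(np.log(2.0e4), 0.1, n_samples)    # W/(m^2 K)
    return toy_wall_temperature(bulk_temp, heat_flux, htc)


wall_temp = propagate()
best_estimate = np.median(wall_temp)                   # best estimate taken as the median
lower, upper = np.percentile(wall_temp, [2.5, 97.5])   # central 95% interval of the output
print(f"best estimate {best_estimate:.1f} degC, 95% interval [{lower:.1f}, {upper:.1f}] degC")
```

The number of samples needed depends on which output statistic is required; estimating the tails of the output distribution generally needs many more model evaluations than estimating its median.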


NTH simulations can be computationally expensive to evaluate, and so it may prove necessary 
to use methods which take results from a high-fidelity model to develop an approximate, lower- 
fidelity, surrogate model (also known as a reduced order or meta-model) that is significantly faster 
to evaluate. Surrogate models are a key tool for performing SA and UQ, and the process of creating 
them introduces a degree of approximation, which (ideally) is itself quantifiable, and should be 
accounted for when interpreting its results. They are also used when evaluation speed is necessary, 
such as when the model is embedded as a component in a larger simulation, used in design 
optimisation, or applied to real-time simulation or control, where evaluating the full model would be 
intractable, too time consuming, or too expensive. 
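
The idea can be sketched as follows, assuming a one-dimensional input and a polynomial surrogate. The function `expensive_model` is a hypothetical stand-in for a costly simulation, and the polynomial degree, training points and input range are arbitrary choices made for illustration. The held-out comparison gives one simple measure of the approximation introduced by the surrogate.

```python
import numpy as np


def expensive_model(x):
    """Hypothetical stand-in for a costly high-fidelity simulation."""
    return 300.0 + 40.0 * x + 15.0 * np.sin(2.0 * x)


# A small number of 'high-fidelity' training runs across the input range.
x_train = np.linspace(0.0, 2.0, 9)
y_train = expensive_model(x_train)

# Fit a degree-4 polynomial surrogate to the training results.
surrogate = np.polynomial.Polynomial.fit(x_train, y_train, deg=4)

# Quantify the surrogate's approximation error on held-out runs (midpoints of the training grid).
x_test = np.linspace(0.125, 1.875, 8)
rmse = float(np.sqrt(np.mean((surrogate(x_test) - expensive_model(x_test)) ** 2)))

# The surrogate is cheap enough to evaluate for a large number of samples.
rng = np.random.default_rng(0)
x_samples = rng.uniform(0.0, 2.0, 100_000)
p95 = float(np.percentile(surrogate(x_samples), 95))
print(f"surrogate RMSE {rmse:.3f}, 95th percentile of output {p95:.1f}")
```

The held-out error should be carried forward when interpreting the sampled results, since the surrogate's output statistics cannot be more accurate than the surrogate itself.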


Thermal Hydraulic Phenomena that Contribute Uncertainty 


The fundamental physics and equations describing the thermal hydraulic phenomena that can be 
present in a NPP are generally well understood. However, making predictions of them with high 
accuracy and certainty is challenging for reasons that broadly fall into two categories: 


Complexity in behaviour, with numerous overlapping and interacting processes. 


Variability or unknown values for properties and conditions, with some processes sensitive to 
small differences and details. 


The uncertainty that arises in modelling due to complexity and variability has a number of origins
relating to physical processes and phenomena, and to their mathematical representations¹². This
includes the following aspects:


Spatial Variation and Evolution of Flows: Flows in real systems often occur in regions such 
as large vessels and plena, where the details of the complex three-dimensional flow field must be 
accounted for. However, even when flows are essentially one-dimensional (such as in pipes and 
annuli), they are rarely fully developed over the regions of importance: the effects of entrances,
bends, valves, junctions, expansions and contractions propagate for significant distances and
interact with downstream components. This can affect heat transfer, pressure losses and mixing
substantially. Most high quality data and correlations for these processes in pipes and components 
assume fully developed flow (although specific and common situations like bend-bend interactions 
are documented, such as by Miller, 2009). By combining them in a flow network, further uncertainty 
is introduced in addition to the inherent uncertainty in the component correlations. 


When building a model, some knowledge of the upstream details is needed to provide realistic
inlet and boundary conditions. Even if the propagating effects or the interactions between components
are known, including them requires an upstream representation of the relevant part of the
system to be modelled as well, at the cost of increased time and resources.


11 Note that the integral of a PDF is a Cumulative Distribution Function (CDF), which varies from 0 to 1, and gives P(X ≤ x),
the probability that a random variable X will be less than or equal to x. It is also common to encounter the Complementary
Cumulative Distribution Function (CCDF), which is P(X > x) = 1 − P(X ≤ x), and is sometimes called the ‘survival
function’.

12 There are additional numerical solution and user aspects that also contribute uncertainty, as discussed in Section 2.1. 
Unsteadiness and Instability of Flows: As a corollary to the effects of spatial variation and
development described above, real flows are rarely in a true steady-state. Plants and equipment
may undergo transients, contain vibrations and acoustic sources, exhibit slow cyclic behaviour or
bifurcations, or have localised unsteady flow (such as in separated or mixing regions). This
can also include Fluid-Structure Interaction (FSI), which can modify the flow.


Correlations developed in idealised laboratory conditions can be based on true steady-state conditions.
However, applying them in an unsteady flow involves the approximation that conditions are
quasi-steady (that is, the local flow processes adapt quickly to the changing
driving forces).


Systems may interact with the variable natural environment and weather, or exhibit long term behaviours
that only emerge intermittently. Spontaneous instabilities can also be generated in otherwise
unchanging conditions (situations involving buoyancy are especially prone to this). It may not be
possible to provide modelling methods with sufficiently realistic and detailed information, to run
them for long enough to resolve such transients, or to give them sufficient fidelity to include the
mechanisms that drive subtle oscillatory processes.


Turbulence: A specific manifestation of unsteadiness is turbulence. As introduced in Volume 
1, Section 4.5.3 and elaborated in Volume 3, Sections 2.2.3 and 2.2.4, turbulence epitomises 
complexity in fluid dynamics. It must usually be modelled with approximations and simplifications, 
inevitably causing the omission of details and generating uncertainty (although there are often 
ways of quantifying the inaccuracy). 


Material Properties: For most solids and fluids to be modelled in an NPP, it is usually possible to 
obtain the relevant thermophysical properties with an acceptable level of accuracy, in their design 
or as-built condition. For more novel materials proposed in new designs, however, applicable data 
may be scarce, and contain significant uncertainty, especially at elevated temperatures. In addition, 
the properties of all materials could be subject to variation in composition and characteristics as 
a result of additives, or as they age, with contamination or exposure to radiation and chemical 
reactions. 


Therefore, the ‘current’ operational properties of materials in a plant, representing the moment 
when a modelling prediction is required, may not be known with the same high confidence, without 
specific tasks to sample, measure and control them. Particular difficulty exists for surface proper- 
ties, such as roughness and emissivity — these depend not only on the material, but also on how a 
component was fabricated, and then on the exposure, corrosion, deposition and ageing that it has 
experienced. These processes are often not well understood or easily modelled, but their influence 
on frictional losses and heat transfer can be significant. 
Geometry: It is usually possible to obtain the geometry for components that are to be modelled
from engineering drawings and CAD models. However, these represent the designer’s intentions, 
and the actual realisation of the as-built plant will contain a level of deviation from these due to 
tolerances and manufacturing variability. The components may then experience changes from their 
‘cold’ dimensions when they are at their operating (or long-term aged) conditions due to thermal 
expansion, mechanical loading, or the effects of irradiation. Such deformations and distortions may 
be subtle and inconsequential, or may lead to significant changes in dimensions (such as in bowing 
of fuel assemblies or the shrinkage of graphite) and in the presence of flow paths (such as gaps 
appearing between structural elements, resulting in new routes for bypass flows). 


It is also possible for the presence or omission of small geometrical features to introduce significant
changes to results. For example, how rounded or sharp the corners of a flow entrance or contraction
are, and whether small lips or chamfers are present, have been found to introduce significant
changes in flows for these components. The judgements and choices of the analyst can introduce
uncertainty here, because such details can be retained or ‘defeatured’ (removed) as part
of the necessary process of developing a simplified computational geometry. Similarly, ridges
or steps that are not present in a CAD model can be present in a real plant, through manufacturing
deviations, protruding seals, wear, deposition, or misalignments caused by assembly or
reassembly after maintenance.


Multiphase Flows: When multiple phases are present, the possible complexity of flows contain- 
ing droplets, free surfaces, films, boiling and entrained bubbles is extensive. In pipe flows, this 
includes transitions between stratified, annular, bubbly, slug and churn flow regimes. The resulting 
mass, momentum and energy transfers between phases and at walls have been extensively studied,
and are often encapsulated in parameterised empirical ‘closures’ or correlations. Therefore,
much of the uncertainty involved in predicting multiphase flows is related to choosing an appropriate
correlation (and confirming that its parameter range applies) for the flow being simulated. The
correlation itself may have been derived in idealised, isolated conditions, and has its own uncertainty.


Coupled Multiphysics Effects: The simulation of most physical phenomena is interrelated — 
for example the evaluation of structural integrity depends on pressures and heat transfer as input 
boundary conditions. However, some phenomena are more tightly coupled and have strong and 
fast feedback interactions. FSI, where large surface deflections influence the flow field, is one
example. Another is nuclear heating: NPP simulations usually include some level of energy source,
either from fission reactions or from decay heat. Determining this heating is an involved assessment
that may need to be made with imperfect knowledge, and so the uncertainty in the prediction
or specification of the magnitude and distribution of these heat sources is inherited by NTH assess- 
ments relying on them. This applies when the information transfer is one-directional and uncoupled 
(applied as volume or boundary energy sources to a model), but is exacerbated where feedback 
and coupled effects are present. 


Coupled effects occur, for example, where the predicted nuclear heating is subject to feedback
from temperatures predicted by an NTH model, as a result of neutronic effects (such as Doppler 
broadening) and geometrical changes (such as an expanding core increasing the spacing be- 
tween fuel elements). The uncertainty in the predicted temperature field, and the uncertainty in the 
models for reactivity are closely coupled. There are more extreme examples of coupled effects, 
such as reactivity-flow instabilities in Boiling Water Reactors (BWRs). These are problematic for a
modelling-led design and substantiation process because, unless modelling methods are carefully
designed to capture these combined effects, they may not predict the presence of some phenom- 
ena. Where such phenomena are predicted, they can require a large number of models to be 
combined (for example, in predicting CRUD induced power shift as demonstrated by Jones et al., 
2018), each of which contributes its own uncertainty. 


The extent of complexity and variability that arises in NTH models means that a level of approx- 
imation, assumption and simplification will always be necessary. Furthermore, expert judgement 
and prioritisation of effort, regardless of the fidelity of the modelling approach, will also need to be 
considered. This volume aims to provide guidance on the concepts, tools and techniques that can 
be used to systematically understand the effects of doing so. 


Why are V&V and UQ Hard for Industrial Users? 


The availability and maturity of formal V&V, UQ and SA methods is not commensurate with the lim- 
ited extent to which they are routinely applied in industrial NTH situations. This has been noted by 
practitioners of BEPU modelling (Ivanov et al., 2019). It is, however, unlikely to be simply because
‘Real V&V and real UQ are a lot of work’ (Weirs and Winokur, 2017), although this is also true. 


One reason may be that SA and UQ methods in particular have formal mathematical foundations
that may seem unapproachable to most practising thermal hydraulics engineers. CFD at
an academic and implementation level also has such formalism¹³. This is, however, not visible
to most CFD users because the tools of industrial practice abstract and encapsulate these in an 
implementation that makes their presence hidden. Some occasions will require users to explore 
and better understand the more formal details, but the well developed nature of the tools for NTH 
modelling provides little necessity or motivation to question and consider the application of them 
during routine use. The near-consensus around a framework of popular methods¹⁴ means that it
is not necessary for a user to choose from one of several applicable, but incompatible approaches. 
A user's skills and knowledge are then generally transferable between software packages and in- 
dustrial sectors, and can be added to gradually, where the application of new modelling options 
can be learned and tested incrementally. UQ does not have these properties; each analysis
process requires an investment of time and self-education for non-specialists to implement it
within its own conceptual structure before any benefits can be realised, and it is harder to transfer
between or compare approaches. UQ methods require some specialisation and a long term commitment
to the approach within an organisation to apply them effectively and consistently.


The framework for NTH modelling is created by the physical world (items of plant such as vessels, 
pumps and pipes, which are made of real materials) and its coherent mathematical treatment is en- 
abled by the laws of conservation of mass, momentum and energy, and by dimensionless groups. 
UQ, on the other hand, is an almost entirely mathematical construct, and so does not have this 
unifying underpinning that engineers and analysts rely upon (often without realising the extent to 
which they do). While the tools of probability and statistics are common to and coherent across UQ 


13 Present in, for example, operator splitting for pressure-velocity coupling, considering a mesh as a ‘Sobolev space’, the 
characteristics of discretisation schemes, the properties of linear solver algorithms etc. This is similar in FEA. 
14 The finite volume discretisation and RANS turbulence modelling being the ‘work-horse’ methods. 
analysis methods, these methods start from different underlying principles and assumptions,
and so this commonality is not necessarily unifying from a user’s perspective. Therefore, the
guidance provided in the ‘comprehensive frameworks’ for V&V and UQ (such as those referenced
in Section 3.2) appears to end-users to be a long way from a complete and pragmatic guide to
their application: in many cases they are a starting point and an organising principle only.


As an example, advanced CFD engineers¹⁵ usually need substantial knowledge of three
elements to fully exploit the discipline: fluid mechanics, mathematics/numerics and computing/programming.
Typically two out of three of these elements are taught in undergraduate courses, and
the final element needs to be learned either in post-graduate study, or via industrial experience.
Requiring UQ to be conducted routinely then adds a further element, which should be recognised.
Mechanical engineers, who comprise a sizeable proportion of NTH practitioners, are also
typically not formally educated, experienced or confident enough in the fundamentals of statistics
and probability theory to draw on them readily. It is likely to be the case that the full range of
knowledge and skills will not be vested in each individual, and so additional engineers, scientists or 
mathematicians, with more specialised UQ knowledge, will be needed to support and supplement 
analysis. Similarly, to specify and incorporate validation data into an analysis structure requires 
individuals that are more specialised in experimental work. In this way, applying UQ to a CFD or 
NTH analysis can be considered as turning it into a multi-disciplinary activity. 


With this starting point, and the size of the topic, in mind, this technical volume intends to provide
clarity on the available and required thought-processes, attitudes, concepts and options for approaching
the topic. Where it does provide firm guidance and recommendations for practice, they
are necessarily in a limited number of well understood cases. Doing ‘real V&V and real UQ’ will be 
involved and case specific. 


15 Meaning users who are skilled and independent, sceptical and questioning, able to fully exploit and customise the code, 
understand the meaning of numerical settings and can choose (and know the limits of) models, discretisation settings etc. 
Concepts for Uncertainty


This section describes the concepts and introduces the mathematical tools used to quantify uncer- 
tainty and perform sensitivity analysis. UQ and SA, while not the same, are closely related — they 
share concepts and methods and can be performed simultaneously in some cases. UQ addresses 
the statistical description of input uncertainties and their propagation through a model, focussing 
on quantifying the impact of this uncertainty on model outputs. SA, on the other hand, considers
which elements of the uncertainty contribute the most to variance in model outputs. Typically, a
simpler set of sensitivity analyses will be performed early in the NTH model development life-cycle,
and used to inform UQ.
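
The distinction can be illustrated with a small sketch using one set of samples for both purposes. The energy-balance model, the input distributions and all values below are hypothetical. The squared correlation coefficient is used here as a simple indicator of each input's share of the output variance; this is only a reasonable approximation when the inputs are independent and the response is close to linear and additive, as it is in this example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20_000

# Assumed, independent input uncertainties (illustrative values only).
inputs = {
    "inlet_temp": rng.normal(290.0, 2.0, n),   # degC
    "power": rng.normal(1.0e6, 5.0e4, n),      # W
    "mass_flow": rng.normal(10.0, 0.2, n),     # kg/s
}
cp = 4200.0  # J/(kg K), treated as known in this sketch

# Hypothetical model: outlet temperature from a steady energy balance.
outlet = inputs["inlet_temp"] + inputs["power"] / (inputs["mass_flow"] * cp)

# UQ: statistical description of the output uncertainty.
uq_mean, uq_std = outlet.mean(), outlet.std(ddof=1)

# SA: approximate share of the output variance attributable to each input.
shares = {name: float(np.corrcoef(x, outlet)[0, 1] ** 2) for name, x in inputs.items()}

print(f"outlet temperature: mean {uq_mean:.1f} degC, standard deviation {uq_std:.2f} K")
for name, share in shares.items():
    print(f"{name}: {share:.0%} of output variance")
```

For strongly non-linear or interacting inputs the shares would no longer sum to approximately one, and a variance-based method designed for that situation would be needed.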


The equivalent conceptual and mathematical underpinnings for V&V are not covered; the required 
methods for V&V are generally well documented or referenced alongside specific instances of 
guidance and standards. 


Sources and Treatment of Uncertainty 


When performing UQ or SA on a model of a physical system, the first step is to identify possible 
sources of uncertainty. The process of determining what uncertainties exist may be a significant 
undertaking, and can indicate many more sources than can be practically assessed. 


Common Sources of Uncertainty 


Figure 2.1 shows several types of sources of uncertainty that will commonly need to be considered 
in an NTH model. 


The uncertainty in the geometry, materials properties and status or configuration of the plant or 
equipment being modelled can be grouped together as uncertain model parameters (often re- 
ferred to as parametric uncertainty). Related to this is uncertainty in the initial and boundary 
conditions for the plant, or sub-systems — these could arise because of natural variability in, or 
lack of knowledge about, the flow rates, temperatures, pressures and turbulence at the (user se- 
lected) boundaries of the system being modelled, internal heat generation sources, or heat transfer 
to external or ambient conditions. 


Model uncertainty (also referred to as structural uncertainty) arises because the mathematical 
model implemented in the code is only an approximation of the physics in reality, and often contains 
deliberate idealisations. 
Figure 2.1: Sources of uncertainty in a thermal hydraulic model.




Thermal hydraulic models consist of balance equations for the conservation of energy, mass and 
momentum, as well as models for material properties and closure and constitutive laws. System 
codes rely heavily on empirical correlations (closures) which are models that represent complex 
phenomena by a simplified relationship between transport fluxes of energy, mass or momentum 
and local flow conditions. The application of Reynolds-Averaged Navier-Stokes (RANS) turbulence 
models (which are also closures) in CFD produces a similar source of model uncertainty. 


In addition to being a source of minor inaccuracies, model uncertainty could lead to a model being 
substantially erroneous or inadequate for the desired task. A model may deliberately omit, or be 
incapable of representing a significant phenomenon. This could arise because the presence or 
importance of that phenomenon was not expected and anticipated when the modelling process 
began, or the model may not contain a sufficiently detailed or realistic representation of the physical 
processes necessary to predict it. 


For all models, the user is required to decide which phenomena need to be modelled, determine 
an appropriate representation or simplification of a real geometry, including the overall size of the 
domain, how to include individual parts and what initial and boundary conditions to apply. Figure 2.2
highlights the importance of user uncertainty in the results of NTH calculations. In both cases
presented, different organisations were asked to perform blind predictions of a benchmark experiment.
The results of these exercises showed significant spread in the predicted metrics between
the different participants; much of this spread may be associated with user uncertainty. Differences
in model set-up, including the nodalisation (choosing types of component and their spatial dis- 
cretisation) as well as interpretations of boundary conditions and modelling choices contributed 
to the user uncertainty. User training, best practice guides and this kind of benchmark and vali- 
dation against experimental data can help to reduce this source of uncertainty in model outputs. 
User uncertainty is also linked to the occurrence of unmodelled physical processes, because the 
identification and ranking process in creating, say, a Phenomena Identification and Ranking Table 
(PIRT) (Section 3.1.1) is inevitably subjective, and the extent of the uncertainty caused cannot be 
assessed. 


Scaling uncertainty arises from the fact that model or sub-model validation and correlation de- 
velopment is often performed on reduced scale experiments using substitute materials or obtained 
from operating experience at a different scale. Several aspects of the real system behaviour are 
scaled to make experiments feasible, cheaper to conduct and easier to instrument, for example
by using reduced power, size, pressure or temperature, or by using simulant fluids (CSNI, 2017b).
The purpose of a scaled experiment is to match the most important dimensionless quantities that 
characterise the real system. However, NPPs contain such a range of phenomena and interactions 
that it is not possible to simultaneously preserve all length scales and time scales, along with the 
flow and heat transfer regimes. This is known as ‘scaling distortion’. It is also possible that the 
real system being modelled operates in regions of the dimensionless number envelope not cov- 
ered by the experiment, or that the experimental geometry is not exactly representative, and so the 
adequacy of the validation evidence for supporting models of the real system must be assessed. 
Thus, there may be a requirement to perform extrapolation outside of the range of experimental 
parameter space, which leads to uncertainty. Furthermore, experimental results contain uncertain- 
ties themselves, in the accuracy or location of their measurements and processed data, but these 
are not always well documented and characterised. 
Figure 2.2: Examples of user effects in system code calculations.

a: Comparison of four RELAP5 model calculations of cladding temperature for CSNI
International Standard Problem number 25 (end of accumulator discharge in a postulated
LOCA) to experimental data. The wide range in predicted cladding temperature and
quench times from the different modelling groups is indicative of considerable user
uncertainty.

b: Comparison of blind modelling predictions for the Phénix prototype pool-type sodium-cooled
fast reactor end-of-life tests programme. The spread in predictions of coolant
temperature at core outlet is indicative of the uncertainty in modelling predictions.


Numerical uncertainty arises through all of the processes involved in implementing and solving a 
numerical model. In CFD models, poor mesh quality or inadequate geometrical or time resolution 
can contribute. This is in addition to the truncation errors and temporal, spatial and equation¹ discretisation
errors intrinsic to the computational method, as well as incomplete or unstable equation
solver convergence. All of these lead to potential uncertainties in the model calculation results (that 
are hard to quantify in realistic calculations). 
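
For the spatial discretisation contribution, one widely used estimate (not described in the text above, and named here as an illustration) is Richardson extrapolation from solutions on three systematically refined meshes. The sketch below assumes a constant refinement ratio and monotonic convergence in the asymptotic range, conditions that are often not met in realistic calculations. The 'solutions' are synthetic values generated from an assumed error law so that the correct answer is known.

```python
import math


def richardson(f_fine, f_medium, f_coarse, ratio):
    """Observed order of convergence and extrapolated value from three mesh solutions."""
    order = math.log((f_coarse - f_medium) / (f_medium - f_fine)) / math.log(ratio)
    extrapolated = f_fine + (f_fine - f_medium) / (ratio**order - 1.0)
    return order, extrapolated


# Synthetic peak temperatures generated from T(h) = 600 + 0.5 * h**2 (assumed, second order).
spacings = [1.0, 2.0, 4.0]   # fine, medium and coarse mesh spacing (refinement ratio 2)
t_fine, t_medium, t_coarse = (600.0 + 0.5 * h**2 for h in spacings)

order, t_extrapolated = richardson(t_fine, t_medium, t_coarse, ratio=2.0)
discretisation_error = abs(t_fine - t_extrapolated)
print(f"observed order {order:.2f}, extrapolated value {t_extrapolated:.2f}, "
      f"fine-mesh error estimate {discretisation_error:.2f}")
```

An observed order far from the formal order of the scheme is a sign that the meshes are not in the asymptotic range, in which case the extrapolated value should not be relied upon.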


Some physical systems exhibit complex unstable chaotic or periodic behaviour, and numerical 
methods may or may not be able to correctly capture this. Likewise, some numerical methods 
exhibit unstable solutions which are a result of purely numerical features — telling the difference 
and having confidence that the predicted presence or absence of complex behaviour is ‘real’ is 
possible in some cases, but requires specific effort. 


Numerical software tools² are elaborate, and are generally not completely free of implementation
errors and defects (‘bugs’) — these can manifest themselves in ways that (subtly or grossly) distort 
the results of a calculation. 


1 Meant in the broad sense, of converting Partial Differential Equations (PDEs) into algebraic linear systems for solution, using 
the finite difference, element or volume method, as well as numerical schemes for evaluating gradients and interpolation. 

2 As well as the compilers and libraries for the code that they are written in, the operating systems that they run on, and the 
hardware that executes them. 
Treatment of Different Types of Uncertainty


The way that uncertainties are selected and characterised depends on how UQ will be incorporated 
and treated in the overall prediction and decision making framework, and on the nature of the output 
metric being considered. 


Uncertain inputs to be propagated for UQ can generally be treated by one of three methods: 


Continuous probabilistic treatment: The most common, and most straightforward approach for 
propagating uncertain inputs through a model. Here the uncertain input, x, is represented by 
a PDF and its associated parameters. This describes the likelihood of all different possible 
values of the uncertain input occurring. A continuous probabilistic treatment is typically used 
to describe aleatory uncertain inputs (or epistemic uncertain inputs which can be treated in
a probabilistic manner).


Interval treatment: Where an uncertain input is bounded by a known maximum and minimum 
interval range, the uncertainty is defined by the ends of an interval [a, b] which contains the 
uncertain variable x. When propagated through a function or model the uncertain output 
metric is contained within a new interval [c, d]. An interval treatment is commonly used to 
represent epistemic uncertainties. 


Discrete treatment: Uncertain inputs can also be represented by a discrete set of options, each
with a specific probability of occurrence, such that the probabilities of all the options sum to 1. For example,
a CFD analysis may be sensitive to the choice of turbulence model, and the modelling
engineer could choose a most likely turbulence model, but other possible models should be 
considered in the UQ, each with their own probability of occurrence. 


It is normally desirable to represent as many uncertain inputs as possible via continuous probability 
distributions. Incorporating discrete probability options or intervals into the UQ can lead to large 
increases in the number of possible model calculations performed, for example, requiring multiple 
separate sets of samples for each combination of epistemic uncertainties. 
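
The three treatments, and the way they multiply the number of sample sets, can be sketched as below. The model, the closure names, their probabilities and all numerical values are hypothetical. Evaluating only the ends of the interval is valid here because the stand-in model is monotonic in the interval input; in general the extremes of the output may occur inside the interval and would need to be searched for.

```python
import numpy as np


def wall_temperature(bulk_temp, heat_flux, htc):
    """Hypothetical stand-in model: T_wall = T_bulk + q'' / h."""
    return bulk_temp + heat_flux / htc


rng = np.random.default_rng(2)

# Continuous probabilistic treatment: heat flux described by a PDF.
heat_flux = rng.normal(5.0e5, 2.5e4, 5_000)   # W/m^2

# Interval treatment: bulk temperature only known to lie in [a, b].
bulk_interval = (285.0, 295.0)                # degC

# Discrete treatment: alternative closures as (heat transfer coefficient, assumed probability).
htc_options = {"closure_A": (2.0e4, 0.7), "closure_B": (1.7e4, 0.3)}
total_probability = sum(prob for _, prob in htc_options.values())

# One set of samples is propagated per combination of interval end and discrete option.
p95 = {}
for name, (htc, _) in htc_options.items():
    ends = [np.percentile(wall_temperature(bulk, heat_flux, htc), 95) for bulk in bulk_interval]
    p95[name] = (min(ends), max(ends))        # output interval [c, d] for this option

n_sample_sets = len(htc_options) * len(bulk_interval)
for name, (c, d) in p95.items():
    print(f"{name}: 95th percentile wall temperature in [{c:.1f}, {d:.1f}] degC")
```

With two interval ends and two discrete options, four sets of samples are needed instead of one, which illustrates why representing inputs by continuous distributions is preferred where it can be justified.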


Significance of Output Metrics 


The way different input uncertainties are treated also depends on the nature of the output metric 
of interest. Some modelling predictions require the ‘mean’ behaviour of the system to be charac- 
terised. This requires the epistemic uncertainty to be well defined to give confidence in the output 
metrics. Other types of analysis may be concerned with the ‘worst-case’ behaviour of the system,
which may be dominated by the low probability ‘tails’ of uncertain inputs which contribute to
the aleatory uncertainty. For example:


• Turbulent or unstable flows produce mean loading, or heat transfer on a structure, but the
details of the spatial and temporal spectrum of the fluctuating or transient part may induce 
a non-linear response in damage or fatigue loads, caused by infrequent combinations of 
fluctuations. 


• For a large number of canisters storing hazardous materials, only a small number of them
may reasonably be expected to fail, but these constitute the main component of the hazard.
Those which do fail will have been subject to the most onerous loading or conditions, whilst
also having the least integrity as a result of lower structural material thickness, or from more
weld defects. The mean behaviour of the canisters may represent no concern, but predicting
a failure probability or rate of the outliers accurately will depend on knowledge of the
combinations of low probability extremes for several uncertain inputs³.
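
The canister example can be sketched numerically as follows, assuming independent normal distributions for load and capacity in arbitrary units. These distributions and values are hypothetical and chosen only so that the result can be checked analytically.

```python
from math import erfc, sqrt

import numpy as np

rng = np.random.default_rng(3)
n_canisters = 1_000_000

# Assumed aleatory variability (arbitrary units, illustrative values only).
load = rng.normal(100.0, 10.0, n_canisters)       # loading experienced by each canister
capacity = rng.normal(150.0, 10.0, n_canisters)   # capacity, e.g. reduced by thin walls or weld defects

margin = capacity - load
failed = margin < 0.0
failure_probability = failed.mean()

# In this simple case the margin is itself normal, so the estimate can be checked analytically.
margin_mean, margin_std = 50.0, sqrt(10.0**2 + 10.0**2)
analytic = 0.5 * erfc(margin_mean / (margin_std * sqrt(2.0)))

print(f"mean margin {margin.mean():.1f}, failure probability {failure_probability:.2e} "
      f"(analytic {analytic:.2e})")
print(f"failed canisters: mean load {load[failed].mean():.1f}, "
      f"mean capacity {capacity[failed].mean():.1f}")
```

The mean margin is comfortably positive, yet a small fraction of canisters fail, and those that do combine above-average load with below-average capacity. The estimate depends entirely on the assumed shape of the tails, which is usually the least well characterised part of an input distribution.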
Figure 2.3: Examples of PDFs and their integrals (CDFs). In each situation, a or b, 15
normal distributions with a range of means and variances (widths) are plotted. Adapted
from Vose (2008).


Figure 2.3 shows how the uncertainty in an output metric differs in the scenarios where: 


• The epistemic uncertainty is small and the aleatory uncertainty is large. If the worst-case behaviour
of the system is of interest and aleatory variability is large, then this may dominate the
output metric uncertainty. As a result, detailed investigation of small epistemic uncertainties 
may not be required. 


• The epistemic uncertainty is large and the aleatory uncertainty is small. If the mean behaviour
of the system is the key output metric, then it may be possible that a full treatment
of the aleatory uncertainty is not necessary. Instead, effort should be focused on epistemic
uncertainties which impact the average value of the output metric.


3 A specific example of this is discussed by Jiang et al. (2021) for TRISO fuel particles, where failure of only a small number
within a pebble represents the dominant potential radiological source term, and modern modelling methods are able to be
applied to assess the statistical likelihood of such failures.


These cases represent clearly distinguishable situations, but often in realistic cases, the encoun- 
tered behaviour will be somewhere in between, and considering a mixture of epistemic and aleatory 
uncertainty will be required. 
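The two situations can be illustrated with a minimal two-loop (nested) sampling sketch in Python. The outer loop samples an epistemic quantity (here, a poorly known mean) and the inner loop samples aleatory variability about it. The function name `spread_of_statistics`, the normal distributions and all numerical values are assumptions chosen for illustration only.

```python
import random
import statistics


def spread_of_statistics(epistemic_sd, aleatory_sd, n_outer=15, n_inner=2000, seed=1):
    """Return (range of the sample means, average 5%-95% width) over the outer samples."""
    rng = random.Random(seed)
    means, widths = [], []
    for _ in range(n_outer):
        # Outer loop: one possible value of the mean, reflecting lack of knowledge.
        uncertain_mean = rng.gauss(100.0, epistemic_sd)
        # Inner loop: inherent (aleatory) variability about that mean.
        samples = sorted(rng.gauss(uncertain_mean, aleatory_sd) for _ in range(n_inner))
        means.append(statistics.fmean(samples))
        widths.append(samples[int(0.95 * n_inner)] - samples[int(0.05 * n_inner)])
    return max(means) - min(means), statistics.fmean(widths)


# Small epistemic, large aleatory uncertainty: the width of each distribution dominates.
print(spread_of_statistics(epistemic_sd=1.0, aleatory_sd=10.0))
# Large epistemic, small aleatory uncertainty: the shift between distributions dominates.
print(spread_of_statistics(epistemic_sd=10.0, aleatory_sd=1.0))
```

Comparing the two returned numbers for each case gives a rough indication of which type of uncertainty deserves the more detailed treatment.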


Initial Sensitivity Assessment 


For both model parameter uncertainty and uncertainty in the initial and boundary conditions, initial 
sensitivity assessments can be performed to highlight the importance of an uncertain input to the 
problem being modelled. If the simulation result proves to be not sufficiently sensitive to the ex- 
pected (or estimated) range of changes in uncertain inputs, then resources likely do not need to be 
spent propagating their uncertainties through the model. Reducing the number of uncertainties to 
propagate is desirable to reduce the effort expended in the uncertainty analysis. Performing these 
assessments is also helpful in identifying important model characteristics such as the smoothness 
of the response, non-linear trends and discovering how robust the solution process is. As a result, 
often an initial sensitivity assessment will be performed to make a preliminary evaluation of relative 
importance prior to embarking on an extensive set of simulations or detailed UQ. 


Although initial sensitivity assessment is exploratory in nature, the following high level activities 
should be considered during the process: 


* Identify a list of key sensitivity analyses to perform. This is likely to include a degree of 
engineering judgement and the use of organisational approaches such as PIRT (Section 3.1). 


* Define the output metrics to assess the parameter sensitivities against. These are likely to
be the same output metrics for the full NTH model, but this is not a necessary requirement.


* Identify a physically reasonable upper and lower bound of each uncertain input parameter to 
perform the initial sensitivity assessment for. 


* Run model calculations for combinations of uncertain inputs. As a minimum, sensitivities
should be run at the upper bound, lower bound and central values of the input parameter.
Generally, this will be for the full thermal hydraulic model, but it may be possible to run a
simplified version of the model or separate calculations to reduce the computational effort.


* Quantitatively rank the input sensitivities for each output metric of the model.


Based on the rankings a judgement can be made on whether or not to include an uncertain input in 
further, more detailed, calculations. These initial sensitivity assessments can thus help determine 
the key uncertain inputs early in the modelling life cycle and optimise how resources are directed. 
For uncertainties not carried forwards, the reasons behind this should be documented and justified.
The methods introduced in Section 4.5 are suitable, although applying only a subset of them in an
approximate/informal way will be appropriate, given the typically rapid and preliminary nature of an
initial assessment.
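The activities listed above can be sketched as a simple one-at-a-time study in Python. The model `toy_model`, the input names and all bounds are hypothetical placeholders and not a physical correlation. In practice the function would wrap a run of the NTH model or a simplified version of it.

```python
def toy_model(inputs):
    """Hypothetical stand-in for an NTH model: returns a single output metric."""
    # Illustrative algebraic relation only.
    return inputs["power"] / (inputs["flow"] * inputs["cp"]) + 0.01 * inputs["inlet_temp"]


# Assumed (lower bound, central value, upper bound) for each uncertain input.
bounds = {
    "power": (0.9e6, 1.0e6, 1.1e6),
    "flow": (9.0, 10.0, 11.0),
    "cp": (4150.0, 4180.0, 4210.0),
    "inlet_temp": (285.0, 290.0, 295.0),
}


def one_at_a_time(model, bounds):
    """Vary each input over its bounds with all other inputs held at central values."""
    central = {name: b[1] for name, b in bounds.items()}
    swings = {}
    for name, (low, mid, high) in bounds.items():
        outputs = [model({**central, name: value}) for value in (low, mid, high)]
        swings[name] = max(outputs) - min(outputs)
    # Rank inputs by the size of the output swing, largest first.
    return sorted(swings.items(), key=lambda item: item[1], reverse=True)


ranking = one_at_a_time(toy_model, bounds)
for name, swing in ranking:
    print(f"{name:12s} output swing = {swing:.3f}")
```

A one-at-a-time study of this kind does not reveal interactions between inputs, so it is only a first screening step before the methods of Section 4.5 are applied.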
Common Methods and Tools


Basic Statistics 


To make use of the concepts, methods and tools described in this technical volume, practising NTH
engineers are likely to benefit from increasing their knowledge of (or refamiliarising themselves
with) the fundamentals of statistics, or from having access to the advice of suitably specialised
individuals. This is necessary to avoid misleading oneself or others when interpreting the outputs
of the suggested methods, to create appropriate model inputs, and to allow rigorous and robust
arguments to be made about the statistical significance of results.


Error propagation: It is expected that some inputs to NTH models will come from experimental 
results, or will be characterised by non-dimensional parameters containing geometry, flow 
and material property characteristics. This is also the case for interpreting experimental data 
for validation comparisons. When applying UQ to both of these tasks, an assessment of 
combined errors or uncertainty in derived quantities is needed. The propagation of errors 
through simple algebraic expressions is introduced in Section 4.2.4 and is well documented;
for example, NIST (2021) covers the fundamentals and Ku (1966) and JCGM (2008)⁴ provide
more detail.

Probability distributions: It is common, in the absence of other knowledge, to assume⁵ that data can
be represented by a normal (Gaussian) distribution. There are a large number of distributions
available with a range of applications and origins, and NIST (2021) describes common ones.
Each distribution is characterised by ‘moments’, the first four of which are mean, variance⁶,
skewness (extent of asymmetry) and kurtosis (how ‘heavy’ or ‘light’ the tails are).

Statistical hypothesis testing: Where a confidence interval (Section 4.7) can be assessed for UQ
or validation purposes, then a statistical test can be applied. This allows the likelihood that 
an observed outcome could be the result of chance to be quantified, and how the certainty 
of statements relates to how much data is needed to assess them. NIST (2021) describes 
these, and the application of the concepts will be described in introductory statistics textbooks 
(e.g. Navidi, 2020). 


Toolkits 


In developing this technical volume, it was concluded that there is insufficient literature available
that is basic, introductory and explanatory enough to make UQ methods accessible from first
principles to the majority of practising or prospective NTH engineers and analysts. To start from the
fundamentals would require a relatively extensive period of study, or access to a specialist already
experienced in this domain. Many of the most succinct, accessible and clear available sources of
information that introduce V&V and UQ are presentations where the topic is explained by example
(such as Weirs and Winokur, 2017 or the Dakota Software Training, 2016), not formal papers
or books. This is a similar position to using CFD, where having access to an experienced
practitioner to ask for guidance is the best way to learn and, in lieu of that, or to extend into a
new capability area, much of the best guidance on practice is in the form of tutorials and training
materials.


4 Available at www.bipm.org/en/publications/guides/#gum, and also referred to as ISO/IEC GUIDE 98-3:2008.
5 Sometimes unconsciously, or without realising that this assumption is implicit in a concept or method.
6 Standard deviation is the positive square root of the variance.


Therefore, a main recommendation is that the establishment or extension/expansion of a capability
in UQ should be based on the methods already implemented in available tools⁷. This is greatly
facilitated by several tools that are used and relied upon in practice in the nuclear industry being
open source, such as Dakota⁸, OpenTURNS⁹ or Uranie¹⁰. Commercial UQ (and more broadly,
optimisation and design exploration) tools are available, and more general-purpose numerical tools
such as MATLAB¹¹, Python¹² or R¹³ also contain many of the necessary building-blocks with (often
open source) libraries providing specialised surrogate model or UQ capabilities (Bouhlel et al.,
2019, Tennøe et al., 2018). Their associated documentation, training materials, and user
communities are likely to be the best method for establishing a UQ capability or adding to its robustness,
acceptance and transferability/comparability. However, even with this approach, a certain level of
understanding of statistics will be necessary to interpret the outputs meaningfully.


These tools typically run as an overarching code, and incorporate Design of Experiments (DoE) 
tools (Section 4.3). They request/initiate the specified evaluations of a separate analysis tool, then 
build surrogate models and/or calculate statistics of the collated outputs. This allows a UQ and SA 
toolbox to be overlaid onto existing analysis tools and models, that users will already be familiar 
with. An additional advantage is that the software that is available for UQ and SA is also usually 
capable of optimisation and model calibration¹⁴, providing a powerful set of methods for design and
interpreting or exploiting experimental data. 
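The overarching workflow described above can be sketched in a few lines of Python: a DoE step generates the cases, a separate analysis tool is evaluated for each case, and statistics of the collated outputs are calculated. This is not the interface of Dakota, OpenTURNS or Uranie. The functions `latin_hypercube` and `run_analysis_tool`, the uniform input ranges and the algebraic placeholder model are all assumptions for illustration.

```python
import random
import statistics


def latin_hypercube(ranges, n_samples, rng):
    """Simple Latin hypercube design: one sample in each equal-width stratum per input."""
    design = [{} for _ in range(n_samples)]
    for name, (low, high) in ranges.items():
        strata = list(range(n_samples))
        rng.shuffle(strata)
        for row, stratum in zip(design, strata):
            # Random position inside the stratum, scaled to the input range.
            row[name] = low + (stratum + rng.random()) / n_samples * (high - low)
    return design


def run_analysis_tool(case):
    """Placeholder for launching a separate analysis code and reading back its output."""
    return 300.0 + 50.0 * case["heat_flux"] / case["htc"]


rng = random.Random(42)
ranges = {"heat_flux": (0.8, 1.2), "htc": (0.9, 1.1)}    # assumed, normalised inputs
design = latin_hypercube(ranges, n_samples=50, rng=rng)  # DoE step
outputs = [run_analysis_tool(case) for case in design]   # evaluation step
print(f"mean = {statistics.fmean(outputs):.2f}, stdev = {statistics.stdev(outputs):.2f}")
```

In an established toolkit the evaluation step would typically write input files, launch the analysis code and parse its results, with the same separation between the driver and the model.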


Visualisation 


Choosing an appropriate visualisation method can significantly aid the interpretation of whether a 
sufficient number of simulations have been performed for a UQ or SA analysis, the adequacy of 
their coverage of the input parameter space, and what the response of a model has been. This 
is especially true in the earlier, exploratory stages of an assessment. It is possible to use purely 
numerical or statistical metrics to interpret UQ or SA, if the behaviour and response is well known, 
but without visualising the data, important features will not be apparent. For example, if an output 
is bimodal (has two peaks or clusters of results, for example caused by a bifurcation of a physical 
process) then the mean value will lie somewhere between them, and represent neither well — 
simply assessing the mean and standard deviation would miss this insight. 
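The bimodal case can be demonstrated with a short Python sketch that uses a text histogram in place of a plot. The two clusters (centred on 1 and 5) and the sample sizes are assumed values for illustration.

```python
import random
import statistics

rng = random.Random(3)
# Assumed bimodal output: half the samples cluster near 1, half near 5.
samples = [rng.gauss(1.0, 0.3) for _ in range(500)] + [rng.gauss(5.0, 0.3) for _ in range(500)]

mean = statistics.fmean(samples)
stdev = statistics.stdev(samples)
print(f"mean = {mean:.2f}, standard deviation = {stdev:.2f}")

# Text histogram with unit-width bins from 0 to 6.
counts = [0] * 6
for value in samples:
    if 0.0 <= value < 6.0:
        counts[int(value)] += 1
for index, count in enumerate(counts):
    print(f"[{index}, {index + 1}) {'#' * (count // 10)}")
```

The bin that contains the mean holds almost no samples, which the summary statistics alone would not reveal.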


The distribution of outputs therefore matters, and a range of visualisation methods should be employed:


Scatter plots: The relationship between two variables (inputs or outputs) can be observed as a plot
of samples. The extent or absence of correlation between them can be seen in the shape
of the cloud of points, and they can be used to interpret the reason for the calculated value of
a correlation coefficient (Section 4.5). Scatter plots are often shown as a matrix of all inputs
against all outputs (or every input and output against each other, giving a symmetric square
matrix) to allow the whole dataset to be surveyed.

Bar charts: Correlation and sensitivity coefficients, as well as Sobol indices, are best interpreted by
how the size of one relates to that for other variables, and bar charts are well suited to this.

Histograms: The distribution of data is indicated by the number of samples found in discrete ‘bins’.
A continuous version of this is a kernel density plot (Section 4.2.4).

PDFs and CDFs: Fitting a known form of probability distribution to the histogram of data is a useful
way to characterise, compare and plot results. The integral of a PDF is the CDF, which
provides a direct visualisation of what proportion of the data is lower than a given value; this
allows confidence intervals or likelihoods of occurrence to be observed directly. A CDF can
also be created numerically as the cumulative sum of a histogram or the integral of a kernel
density function.

Box and violin plots: A box plot shows the median and interquartile range (from 25% to 75%) of a
distribution of data, as well as its minimum and maximum range. By combining features of a
box plot and a kernel density plot, violin plots show all of the details of a distribution, as well
as the quartiles of the data, and are particularly useful when comparing several datasets.


7 This is perhaps inevitable, particularly if made analogous to an organisation establishing a CFD capability: it would not be
reasonable or sensible to expect to do anything other than choose an existing code and ‘learn by doing’.
8 dakota.sandia.gov
9 openturns.github.io/www
10 sourceforge.net/projects/uranie
11 www.mathworks.com
12 Via NumPy numpy.org and SciPy www.scipy.org
13 www.r-project.org/
14 Table 3.1 defines this and distinguishes between it and other related activities.


Creating several of these plots from the same dataset is convenient with common numerical tools.
They are also the visualisation methods that are broadly applied in statistics and data science, and
open source tools are available, for example Plotly¹⁵ or seaborn¹⁶ in Python.


Examples of PDFs are shown in Section 4, the use of scatter plots, histograms, bar charts and
CDFs is demonstrated in Study B, and an example of a violin plot is shown in Figure 2.4¹⁷.


Figure 2.4: 1000 random samples drawn from a standard normal distribution visualised
as a histogram, box plot and violin plot.
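The quantities that a box plot and a CDF display can also be computed directly. The sketch below uses only the Python standard library on 1000 standard normal samples, mirroring the data described for Figure 2.4. The helper `empirical_cdf` and the random seed are assumptions for illustration, and plotting libraries such as those noted above would normally be used for the figures themselves.

```python
import random
import statistics
from bisect import bisect_right

rng = random.Random(7)
samples = sorted(rng.gauss(0.0, 1.0) for _ in range(1000))


def empirical_cdf(sorted_samples, value):
    """Fraction of samples less than or equal to `value`."""
    return bisect_right(sorted_samples, value) / len(sorted_samples)


# Quartiles, as drawn by a box plot.
q1, median, q3 = statistics.quantiles(samples, n=4)
print(f"min = {samples[0]:.2f}, Q1 = {q1:.2f}, median = {median:.2f}, "
      f"Q3 = {q3:.2f}, max = {samples[-1]:.2f}")

# Proportion of the data below a chosen value, as read from a CDF.
print(f"fraction of samples <= 1.645: {empirical_cdf(samples, 1.645):.3f}")
```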


15 plotly.com/graphing-libraries 
16 seaborn.pydata.org, which is based on Matplotlib (matplotlib.org) and described by Waskom (2021) 
17 Adapted from towardsdatascience.com/violin-plot-its-time-to-ditch-the-box-plots-785629b0ff3a
Propagation of Uncertainty


Once the uncertain inputs of interest have been selected, the next step is to propagate¹⁸ them
through the model to understand their impact. The overall aim (summarised in Figure 2.5) is to
quantify the uncertainty in the model output metric of interest, often in the form of a PDF.


At the outset of an assessment, it is necessary to choose a conceptual framework for the
application of statistical methods. There are broadly two methods in statistical analysis:
frequentist and Bayesian. Section 4.1 provides a high level overview of these two approaches
with respect to the subject of UQ to provide context for the topics covered in the remainder
of the section.


... uncertainty can be incorporated into a mathematical framework. A mathematical description
of each uncertain input can then be defined. This is discussed in more detail in Section 4.2.


To assess the impact of a set of uncertain inputs on an output metric, an NTH model needs
to be evaluated at a number of points in the parameter space to capture the underlying model
response to these uncertainties. The range of parameter space and number of evaluations
required is determined by the characterisation of the uncertain inputs and the type of
uncertainties considered. Defining the points in the parameter space at which to evaluate the
model is known as sampling, a range of methods for which are discussed in Section 4.3.


[Figure 2.5 flowchart; recoverable box labels: Choose which simulations to run to perform SA and UQ
directly; Choose which ...; Build surrogate model; Sensitivity analysis; Uncertainty quantification,
interpretation of uncertainty in output metrics; Combine different forms of uncertainty.]

Figure 2.5: Assessment process and topics detailed in Section 4.


18 The alternative to propagation methods are ‘extrapolation’ methods (CSNI, 2016b, Section 5), which are generally applied
to the scaling of IET data. Most UQ methods use propagation, and they will be the focus of this volume. It is also possible
to apply Inverse Uncertainty Quantification (IUQ) where knowledge of the output is used to determine uncertainty on the
inputs. This technical volume only considers forward propagation, although IUQ is starting to be applied to system code
assessments (Section 3.3).


In practice, high-fidelity NTH models are often computationally intensive and executing them
multiple times to propagate uncertainty can be prohibitively resource intensive. Instead, a surrogate
model can be developed, based on the model outputs of the high-fidelity model, to allow repeated
interrogations of the surrogate model to propagate the sources of uncertainty. Care needs to be
taken when developing these surrogate models, because a poor representation of the full-order model
could distort or contribute significantly to the uncertainty in the model outputs. Section 4.4
discusses the requirements of a surrogate model and provides an overview of different types that
can be applied.
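A minimal sketch of the surrogate approach is given below in Python. A quadratic polynomial is fitted through three runs of a stand-in 'expensive' model, checked against a run that was not used for training, and then sampled many times. The function `expensive_model`, the choice of a quadratic surrogate and the normal input distribution are assumptions for illustration. Section 4.4 covers the surrogate types actually recommended.

```python
import math
import random
import statistics


def expensive_model(x):
    """Hypothetical stand-in for a high-fidelity NTH calculation."""
    return 400.0 + 40.0 * math.exp(0.5 * x)


def fit_quadratic(xs, ys):
    """Quadratic surrogate through three training points (Lagrange form)."""
    def surrogate(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            weight = 1.0
            for j, xj in enumerate(xs):
                if j != i:
                    weight *= (x - xj) / (xi - xj)
            total += yi * weight
        return total
    return surrogate


train_x = [-1.0, 0.0, 1.0]  # three full-model runs
surrogate = fit_quadratic(train_x, [expensive_model(x) for x in train_x])

# Check the surrogate against a full-model run that was not used for training.
check_x = 0.5
holdout_error = abs(surrogate(check_x) - expensive_model(check_x))

# Propagate an assumed normal input distribution through the cheap surrogate.
rng = random.Random(11)
outputs = [surrogate(rng.gauss(0.0, 0.3)) for _ in range(10000)]
print(f"hold-out error = {holdout_error:.3f}")
print(f"output mean = {statistics.fmean(outputs):.2f}, stdev = {statistics.stdev(outputs):.2f}")
```

The hold-out error is the surrogate's own contribution to the output uncertainty and should be small compared with the spread being quantified.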


Sensitivity analysis draws upon the techniques outlined above to determine which model inputs 
have an important effect on the model outputs of interest. The focus is on identifying the most 
important phenomena, model parameters and initial and boundary conditions, and which variables 
and relationships capture this. It can be performed at various stages in development of a model. 
Section 4.5 describes the process for performing SA and outlines useful metrics that an NTH 
analyst can use to assess the parameters to which an output is most sensitive. 
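One simple sensitivity metric is the correlation coefficient between each sampled input and the output. The sketch below computes Pearson coefficients for a hypothetical three-input model. The function names, the uniform input ranges and the linear model are assumptions for illustration, and Section 4.5 describes the fuller set of metrics.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def toy_model(a, b, c):
    """Hypothetical model: strongly dependent on a, weakly on b, independent of c."""
    return 5.0 * a + 1.0 * b + 0.0 * c


rng = random.Random(5)
inputs = {name: [rng.uniform(0.0, 1.0) for _ in range(2000)] for name in ("a", "b", "c")}
outputs = [toy_model(a, b, c) for a, b, c in zip(inputs["a"], inputs["b"], inputs["c"])]

coefficients = {name: pearson(values, outputs) for name, values in inputs.items()}
for name, r in sorted(coefficients.items(), key=lambda item: abs(item[1]), reverse=True):
    print(f"{name}: r = {r:+.3f}")
```

A correlation coefficient captures only linear trends, so scatter plots of the same samples remain a useful check.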


Section 4.6 draws on the techniques mentioned above to describe how UQ can be performed, Sec- 
tion 4.7 discusses how its outputs can be interpreted, and Section 4.8 describes how to combine 
this with other forms of uncertainty. 


After reading Section 3, it is recommended that Section 4 should be read in its entirety once, with 
the expectation that it will contain a large range of potentially unfamiliar concepts, whose relevance 
and applicability is not immediately obvious. After subsequently beginning to apply UQ tools and 
methods to their own problems, the intention is that readers will recognise the terminology and 
options for methods, and can return to the relevant part of Section 4 for guidance and to obtain the 
suggested references. 
Organisational Approaches


Organisational approaches are primarily intended to transparently record and present a common
view that has been elicited from a wide range of parties, highlighting gaps and facilitating
communication and tracking of progress and status. The most common approaches are introduced in
Sections 3.1.1 to 3.1.3. They typically originate from national research bodies, regulators and
international nuclear industry agencies (such as the OECD NEA and the IAEA), who commission
and publish international studies of their use. Some examples describing where they have been
applied in practice are given in Section 3.1.4.


EMDAP, CSAU and PIRT 


One of the most established approaches is the Evaluation Methodology Development and Applica- 
tion Process (EMDAP), which is described in the United States Nuclear Regulatory Commission’s 
(US NRC) Regulatory Guide 1.203 (US NRC, 2005). It is intended to provide a process, suitable for 
licensing, to develop and assess ‘Evaluation Models’ (numerical tools) used to evaluate accidents 
and transients. It contains 20 steps, grouped into four elements: 


1. Establish Requirements for Evaluation Model Capability. 
2. Develop Assessment Base. 
3. Develop Evaluation Model. 


4. Assess Evaluation Model Adequacy. 


One of its required steps in Element 1 is creating a PIRT, and a UQ process is required in Element 
4. The expected (or default) UQ methodology, typically applied to system code analysis, is Code 
Scaling, Applicability and Uncertainty (CSAU) (US NRC, 1989, Wilson, 2013). The Office for Nu- 
clear Regulation (ONR) also consider CSAU as relevant good practice, and a suitable reference 
method to compare to (ONR, 2019). 


An introduction to PIRT and the steps involved is given in Volume 1, Section 4.2, and while the
process was developed as part of CSAU, it has grown to become a powerful tool in its own right
(US NRC, 2005):


The PIRT should be used to determine the requirements for physical model develop- 
ment, scalability, validation and sensitivities studies. Ultimately, the PIRT is used to 
guide any uncertainty analysis or in the assessment of overall [evaluation model] ad- 
equacy. The PIRT is not an end in itself; rather it is a tool to provide guidance for the 
subsequent steps. 
It should be kept in mind that a PIRT contains subjective opinions, and is only a part of an overall,
iterative evaluation process. For example, a PIRT can be used to develop Separate Effect Test (SET)
and Integral Effect Test (IET) facilities, which themselves add to the knowledge needed
for an evolving PIRT for an NPP.


It is also possible for activities like scaling and sensitivity analysis to help to produce quantified 
ranking in a PIRT (Martin, 2011, Yurko and Buongiorno, 2012), rather than using only qualitative 
assessments, such as assigning a low, medium or high importance to an identified phenomenon. 


Predictive Capability Maturity Model 


Another common method is the Predictive Capability Maturity Model (PCMM) (Oberkampf et al.,
2007), originating from Sandia National Laboratories. It can be applied to help manage risk and avoid
errors in any Modelling and Simulation (M&S) activity that is used to inform a decision, and asks a
group of stakeholders to score the maturity of six elements:


Representation and Geometric Fidelity: What features are neglected because of simplifications or
stylisations?

Physics and Material Model Fidelity: How fundamental are the physics and material models
and what is the level of model calibration?

Code Verification: Are algorithm deficiencies, software errors, and poor Software Quality
Engineering (SQE) practices corrupting the simulation results?

Solution Verification: Are numerical solution errors and human procedural errors
corrupting the simulation results?

Model Validation: How carefully is the accuracy of the simulation and experimental results
assessed at various tiers in a validation hierarchy?

Uncertainty Quantification and Sensitivity Analysis: How thoroughly are uncertainties and
sensitivities characterised and propagated?


Each element is given a maturity score from 0 to 3, based on whether that aspect of a modelling
capability is suitable to support applications of increasing impact or importance:

0. Low Consequence, Minimal M&S Impact, e.g. Scoping Studies. 

1. Moderate Consequence, Some M&S Impact, e.g. Design Support. 

2. High-Consequence, High M&S Impact, e.g. Qualification Support. 

3. High-Consequence, Decision-Making Based on M&S, e.g. Qualification or Certification. 


The elements are usually presented as the rows of a table and the maturity levels as the
columns. Weirs and Winokur (2017) describe the intent of PCMM as being to:


* Clarify the expectations of the customer, stakeholders and project team with respect to the
rigour needed in a computational simulation study.


* Ensure that a consistent set of technical questions is asked of the team.


* Illustrate progress over time, across broad technical fronts.
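The table layout described above, and the way it highlights gaps, can be sketched with a small Python data structure. The scores and the required level shown are hypothetical, chosen only to illustrate the layout, and do not represent any real assessment.

```python
ELEMENTS = [
    "Representation and Geometric Fidelity",
    "Physics and Material Model Fidelity",
    "Code Verification",
    "Solution Verification",
    "Model Validation",
    "Uncertainty Quantification and Sensitivity Analysis",
]

# Hypothetical scores agreed by stakeholders (0 to 3) and the level assumed to be
# required for the intended use, here level 2 (qualification support).
scores = dict(zip(ELEMENTS, [2, 2, 1, 1, 2, 0]))
required_level = 2


def maturity_gaps(scores, required_level):
    """Return elements scoring below the required level, largest gap first."""
    gaps = {name: required_level - s for name, s in scores.items() if s < required_level}
    return sorted(gaps.items(), key=lambda item: item[1], reverse=True)


# Elements as rows, maturity levels 0 to 3 as columns.
print(f"{'Element':52s} 0 1 2 3")
for name in ELEMENTS:
    marks = " ".join("x" if level == scores[name] else "." for level in range(4))
    print(f"{name:52s} {marks}")

for name, gap in maturity_gaps(scores, required_level):
    print(f"gap of {gap}: {name}")
```

Repeating the scoring at intervals and comparing the tables is one simple way to illustrate progress over time.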
Experience of the use of PCMM in practice brought about an evaluation of the utility and
coverage/emphasis of its original formulation, and it has evolved to a ‘fourth generation’ form
(Hills et al., 2013). As part of its application to the Consortium for Advanced Simulation of Light
Water Reactors (CASL) programme, comparisons were made between PCMM and CSAU. It was concluded
that neither is necessarily complete, but that they are not fundamentally different or incompatible,
and that applying a method that combines aspects from both may be beneficial (Rider
et al., 2013). In a similar way to PIRT, methods are also under development to apply quantitative
assessments of the maturity (Lin and Dinh, 2020).


Approaches from Other Industries 


There are also organisational approaches that have been developed in other industries to address 
similar challenges of the increase in importance of modelling and simulation. For example: 


* Quantification of Margins and Uncertainties (QMU) is used in the US nuclear weapons
programme (Pilch et al., 2006, Helton, 2009), and has commonalities with PCMM in key authors
from Sandia National Laboratories.


* Aerospace is another regulated, high consequence industry that makes extensive use of
modelling and simulation, and NASA formalised their approach to incorporating it into their
activities in a standard, arising from the outcome of the investigation into the Columbia
accident (NASA, 2016). It has an emphasis on applying V&V and UQ with a graded approach,
by considering the consequence of a decision, and the extent of influence that M&S has on
it.


* The development of safety cases requires processes to assess the credibility of evidence,
much of which originates from modelling and simulation, and to incorporate it into a structured
argument. The US NRC have created a generic safety case in the form of a credibility
assessment framework for models of critical boiling transition (US NRC, 2019b). They demonstrate
the assessment of the evidence and maturity of component parts of the models by applying
Goal Structuring Notation (GSN), a method of expressing the structure and interrelationship
of evidence and arguments, which is used to construct safety cases in a range of industries.


Examples of PIRT and PCMM 


When applying PIRT and PCMM, there is not a ‘one size fits all’ approach, and their benefits only 
emerge through systematic use. This includes updating them at various stages of a programme, 
so that they can help to determine the most important or efficient steps to take next. Their utility is 
not realised if they are generated once, as an end in themselves at the start of a programme, and 
not revisited and revised. 


The most instructive guidance that can be given in their application is to study some of the
examples that have been published. All of the references cited below show specific examples of
the application of PIRT or PCMM (or both) to situations covering a broad range of maturity of both
modelling methods and the reactor technology under consideration.


* It is possible to apply the most detail and specificity, in terms of phenomena, faults and
accidents, predictions and validation comparisons, to reactors that are already operating.
These range from the original application to Loss-Of-Coolant Accidents (LOCAs) (US NRC,
1989, Wilson and Boyack, 1998) to re-assessment of the safety features of reactors (Wright
et al., 2014) and fuel storage (CSNI, 2018) following the Fukushima Daiichi accident.
Similarly, detailed knowledge of plant behaviour can be applied to the process of creating and
substantiating confidence in advanced multiphysics tools (Jones et al., 2018).


* There are also examples for reactor designs that are not in existence, but because there is
experience with existing reactors using the same technology, significant effort can be applied
to planning their design and the strategy for their licensing using PIRT. The NGNP VHTR
concept underwent such an assessment (ORNL, 2008a, ORNL, 2008b) along with its TRISO
fuel (US NRC, 2004).


* Where reactor technologies are significantly less developed, the bounding faults and
accidents are not necessarily known, and the modelling methods are not as mature. Here the
PIRT and PCMM process can identify and prioritise fundamental research, modelling method
development, and identify validation programmes that will be necessary for design and
licensing. The US NRC used PCMM to assess the readiness of their codes for licensing all
types of Gen IV reactors (US NRC, 2020). Some recent examples of where the PIRT
process has been used to discover and combine areas of research and development for Molten
Salt Reactors (MSRs) can be found in Diamond et al. (2018), Lin et al. (2019) and Singh
et al. (2019), where the importance of UQ and SA in understanding the best areas to focus
experimental effort is clear.


Approaches to Component Activities 


Component activities are the second tier introduced in Section 1.1. The guidance discussed in this
section is not intended to cover any particular type of assessment or modelling tool¹. Instead, the
intention is to discuss the interactions between the difficulty of the tasks, the extent of available
guidance and the needs and expectations of industrial analysis, particularly in the context of the
discussion in Section 1.4.


Table 3.1 is intended to clarify the nature of each component activity, the scope of which can often
be ambiguous or misunderstood if not clearly distinguished. Solution verification is defined here
in relatively narrow terms, and broader activities are needed, so Quality Assurance Verification is
also defined. Solution verification ought to fulfil a broader remit, but it is often discussed only
in terms of the details of numerical error (Roy, 2005), mainly mesh resolution convergence.


1 That information can be found in Sections 2.2, 3.3 and 3.4 
Code Verification
  What is it? Assessment of errors in code solutions and order of convergence in response to mesh refinement.
  How do you do it? Computing absolute errors for simple and exact analytical or manufactured solutions.
  Why do you do it? To confirm correct implementation of algorithms in code, and to identify algorithmic weaknesses.
  Gets confused with? SQE, benchmarking, code-to-code comparison.
  Why is it hard? Good reference solutions are hard to find; work is tedious and often unappreciated.

Solution Verification
  What is it? Numerical estimation of errors in solutions for realistic problems.
  How do you do it? Mesh refinement studies and error estimates for user chosen metrics.
  Why do you do it? To obtain estimates of the numerical uncertainty in quantities of interest for problems of interest.
  Gets confused with? Code verification, validation.
  Why is it hard? Problems of practical interest are often under-resolved and quantities of interest are often poorly
  behaved under mesh refinement; work is tedious and often unappreciated.

Quality Assurance Verification
  What is it? Independent checking of all aspects of inputs, geometry, meshing, model selection and setup, solution,
  post-processing and reporting.
  How do you do it? Detailed inspection, assessment or re-analysis by suitably experienced independent person.
  Why do you do it? Mistakes, oversights and unjustified assumptions in data gathering, modelling, interpretation and
  reporting of complex models are common.
  Gets confused with? Code or solution verification, validation.
  Why is it hard? Necessary to have skilled and experienced analysts who have not been part of the task to provide
  independence. Requires detailed record keeping and robust justification of choices.

Model Validation
  What is it? Comparison of model results with experimental data or observations, or with higher-fidelity model results.
  How do you do it? Quantitative comparison of experimental and model results, considering uncertainties in both.
  Why do you do it? To determine whether a model in a code is an adequate representation of reality for an intended
  application.
  Gets confused with? Verification, calibration, uncertainty quantification.
  Why is it hard? Experiments are often limited in quality, scope, or number, often have incomplete setup description
  and poorly understood errors, and do not cover enough of the application regime.

Calibration
  What is it? Adjusting model parameters so that simulation results match data and observations.
  How do you do it? Trial and error or optimisation and statistical approaches.
  Why do you do it? Sometimes a necessary activity to help ensure that a model produces acceptable results with some
  justification.
  Gets confused with? Validation, benchmarking.
  Why is it hard? Producing quantifiably ‘good’ results with inherently limited physical models is not easy. Users have
  complex problems that need answers, so they pragmatically ‘do their best’.

Sensitivity Analysis
  What is it? The systematic or ad-hoc study of how strongly input parameters or model choices affect outputs of interest.
  How do you do it? Manual or statistical studies of inputs and their effect on outputs.
  Why do you do it? To discover and rank the importance of inputs, or to generate surrogate models.
  Gets confused with? Uncertainty quantification, calibration.
  Why is it hard? The large dimensionality of the input space limits the scope of formal sensitivity analysis; ad-hoc
  studies risk missing the importance of parameters and excluding them without good justification.

Uncertainty Quantification
  What is it? Evaluation of how outputs vary with physically realistic variability in model inputs or originate in the
  applied methods.
  How do you do it? Statistical studies of the distribution of outputs obtained by propagating distributions of input
  quantities.
  Why do you do it? Quantifying the potential range of values in the computed outputs, to compare available limits or
  margins.
  Gets confused with? Sensitivity analysis, validation, calibration.
  Why is it hard? Can be very resource intensive to examine the variability of all inputs and models. Realistic
  probability distributions or ranges of input quantities may not be known.


Table 3.1: Comparison of the nature and scope of component activities, adapted from Weirs and Winokur (2017).


Methodologies


3.2.1 Frameworks and Guidance for V&V and UQ


An introduction to the agreed definitions of the terms and overall aims of verification, validation and
uncertainty quantification is given in Volume 1, Section 4.3. Extending from this, the intention here
is to consider how they can be enacted practically, especially:


... Where decisions are increasingly based on predictive simulations, V&V is about 
your personal reputation, your credibility as a scientist, and the consequences of your 
analyses. 


Weirs and Winokur, 2017 


This encapsulates the broad point that V&V, UQ and the process of gaining confidence in
a prediction depend primarily on an attitude and approach that does not assume that a
numerical result is ‘right’ or adequate, but instead takes responsibility for making sure that it is
sufficiently ‘right’ for its purpose, and for the correct reasons. This is a manifestation of safety
culture: a questioning attitude, a rigorous and prudent approach, and the necessary communication
(IAEA, 1991).


This leads to another key point — it is not possible to simply use a piece of software that has been 
previously shown to be adequate or validated, and apply it in a different context, without giving 
thought to the reasons to believe it: 


... the validation of a set of scientific simulation tools is case-specific. A tool cannot be 
said to be validated or qualified in general; however, it can be said to be validated for a 
given application in a given context. 


Roelofs, 2019 


Even this point is open to challenge: describing a tool as ‘validated’ in a particular
application and context cannot be an absolute statement. All that it can mean is that sufficient effort
and process have been applied so that there is no evidence demonstrating that the predictions
made² are unacceptably inaccurate for the intended purpose. What process to follow, how much
evidence needs to be assessed and what level of disagreement would be needed to ‘invalidate’ the
prediction are matters to be negotiated between the analyst and the user (or assessor, approver,
regulator) of the analysis, and potentially corroborated by peer review. A common view on agreed
frameworks, standards and precedent (in the form of benchmark studies) provides a mechanism to
facilitate this.


There are several references that provide a detailed elucidation of the basic terms and conceptual
frameworks. The guidance provided by Oberkampf and Roy (2010) (alternatively in Roy and
Oberkampf, 2011) is applicable to scientific computing generally. The ASME (2009, V&V 20)
standard is applicable to V&V for CFD and heat transfer specifically, and there are widely cited
references by Oberkampf and Trucano (2002), Roache (1997) and Stern et al. (2001) that also deal
with V&V and UQ for CFD.


² This is similar to the distinction between mathematics and science: in mathematics, ‘proof’ is possible; in science, a theory
is accepted while it has power to explain and has not been falsified or contradicted by evidence.


The topic should not be considered a settled issue, however, particularly given the gap that exists
between these formal and mathematically sophisticated concepts³ and the limited extent of their
use in common practice (Roache, 2016). In the broader literature available, much of the guidance
on V&V and UQ actually refers back to or is developed from the underlying ideas described by the
authors noted above⁴ and ASME, so there is less plurality and independence in the elaboration of
concepts than might appear. A new standard, ASME V&V 30, is in development⁵ that specifically
deals with ‘Verification and Validation in Computational Simulation of Nuclear System Thermal
Fluids Behavior’.


It will rarely be necessary in an industrial context to start a modelling task with a ‘free hand’ or ‘blank 
page’ in these regards because, particularly for QA, nuclear operators, developers and suppliers 
have established quality assurance processes and procedures. These encapsulate many error 
traps and provide significant opportunities for good practice, rigour and traceability, and will need 
to be adhered to when working in these domains regardless of other guidance. 


Obtaining Validation Data 


For large-scale model development and simulation supported licensing programmes, it should 
be expected that design-specific validation experiments will be necessary. Ideally a hierarchical 
method would be employed for complex problems (Oberkampf and Trucano, 2003), where ev- 
idence is established in the individual aspects of the simulation through validation experiments 
designed specifically for the purpose. 


A new or emergent modelling problem will often not have any, or sufficiently complete validation 
data available, especially if it has not been simulated by an organisation previously. There is rarely 
time, budget, facilities or expertise to perform experiments to support a one-off or initial analysis. 
This means that the only option is to look in open literature for the closest applicable experimental 
case. Even where suitable data can be found, it is often too simple, hard to relate to the flow or
heat transfer regime of interest, or too incompletely described for replication. For analyses that
will be part of a larger programme of assessment, specific IETs may be commissioned, but even in 
those cases, it can be hard to obtain ‘CFD grade’ data at prototypical conditions. Further discussion 
can be found in Volume 1, Section 4.6. Modelling is also used to justify building (or not building) 
such large test facilities, or to help design them, and so the problem of insufficient validation data 
occurs at this stage too, where it is needed to add confidence to the decisions. 


The suitability for validation purposes of papers found in the literature is variable, and their
shortcomings for this purpose may emerge only after significant effort has been applied to
replicating them. Experiments that are designed and reported with code validation in mind are
substantially preferable. As a bridge between development programme-specific experimental
facilities and the open literature, benchmarking exercises and databases of simple reference cases
exist to test and disseminate the capabilities of methods and tools, and are very valuable for
validation. The available resources and past benchmarking activities depend on the type of modelling
tool, and specific examples are discussed in Sections 3.3 and 3.4 for system codes and CFD respectively.


³ These authors are also not always in agreement (Oberkampf, 2002), so it may be necessary to ‘pick a side’.

⁴ And their co-authors, primarily in the US nuclear weapons community, although there is substantial and well-established
crossover to nuclear energy.

⁵ Along with ASME V&V 40 for medical devices and others in development for advanced manufacturing, energy systems and
machine learning.
The US NE-KAMS (Nuclear Energy – Knowledge Base for Advanced Modeling and Simulation)
project was initiated to build a validation database to assemble, interrelate, preserve and exploit
existing multiphysics and multilevel (in a hierarchy) validation knowledge (details of experimental
configurations as well as the resulting data) from a wide range of sources. Its ambitious objectives
(Weirs et al., 2011) reflect the aims of the US nuclear V&V community (Oberkampf and Trucano, 
2007). This project would be a highly valuable resource, and a prototype (based on the Gen IV 
Materials Handbook hosted by Oak Ridge National Laboratory, Ren, 2013) was under construction 
in 2017 (Ren, 2016), but more recent information is not readily available in the public domain 
regarding its progress or status. 


An alternative approach to producing data is to apply higher fidelity models either to calibrate or 
supply inputs for lower fidelity codes and also to act as a surrogate for validation data. Examples 
are the derivation or assessment of the pressure drop characteristics for a system code component 
using CFD, or the use of Large Eddy Simulation (LES) or Direct Numerical Simulation (DNS) for
comparison to or improvement of RANS CFD models (Merzari et al., 2020). The use of higher
fidelity modelling results as true validation is not universally accepted, but it is another possible 
item in a modelling engineer’s toolbox. 


Making Comparisons Between Validation Data and Simulations 


Once suitable validation data has been identified, and simulation results intended to replicate it 
have been created, then a comparison is needed between them. This is not as straightforward a 
task as it might appear. 


The most obvious initial approach taken is to plot the experimental data against the simulation 
results. In many cases this approach is the only comparison that is performed, and the assessment 
of the level of agreement and confidence provided is judged based on visual inspection alone, 
without the use of quantitative analysis. This can be a suitable comparison, if the intention of the 
validation is for exploratory (screening or initial decision making) purposes, aiming to determine if 
a simulation can qualitatively reproduce certain features and trends for a phenomenon of interest. 


Visual comparison between validation data and simulation is, however, often disparagingly referred 
to as the ‘viewgraph norm’, because of its lack of numerical assessment. This criticism is heard 
strongly from those engaged in code verification. In that domain, simulation results can and should 
be assessed by comparison to analytical solutions and by the Method of Manufactured Solutions 
(MMS). These should produce numerical simulation results that are nearly identical to the com- 
parison function. The very small remaining differences constitute rigorous error estimates, so how 
they behave under mesh refinement and depend on numerical algorithm implementation and con- 
vergence requires quantitative assessment. However, in validation, such precise quantification is 
not readily achievable. 
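To illustrate the kind of quantitative assessment used in code verification, the following is a minimal sketch of the Method of Manufactured Solutions applied to a one-dimensional model problem. The problem, the chosen solution u = sin(πx), the function names and the mesh sizes are all assumptions made for illustration and are not taken from any particular code or standard. The source term is derived from the chosen solution, the discrete problem is solved on successively refined meshes, and the observed order of accuracy is compared with the formal order of the scheme (second order here).

```python
import math


def solve_poisson_max_error(n):
    """Solve -u'' = f on (0, 1), u(0) = u(1) = 0, with n equal intervals.

    The source term f = pi^2 sin(pi x) is manufactured from the chosen
    solution u = sin(pi x). Returns the maximum nodal error.
    """
    h = 1.0 / n
    x = [i * h for i in range(1, n)]  # interior nodes only
    rhs = [h * h * math.pi ** 2 * math.sin(math.pi * xi) for xi in x]

    # Thomas algorithm for the tridiagonal system with stencil (-1, 2, -1).
    m = n - 1
    c_mod = [0.0] * m
    d_mod = [0.0] * m
    c_mod[0] = -0.5
    d_mod[0] = rhs[0] / 2.0
    for i in range(1, m):
        denom = 2.0 + c_mod[i - 1]
        c_mod[i] = -1.0 / denom
        d_mod[i] = (rhs[i] + d_mod[i - 1]) / denom
    u = [0.0] * m
    u[-1] = d_mod[-1]
    for i in range(m - 2, -1, -1):
        u[i] = d_mod[i] - c_mod[i] * u[i + 1]

    return max(abs(ui - math.sin(math.pi * xi)) for ui, xi in zip(u, x))


def observed_order(error_coarse, error_fine, refinement_ratio=2.0):
    """Observed order of accuracy from errors on two systematically refined meshes."""
    return math.log(error_coarse / error_fine) / math.log(refinement_ratio)


if __name__ == "__main__":
    errors = [solve_poisson_max_error(n) for n in (16, 32, 64)]
    for coarse, fine in zip(errors, errors[1:]):
        print(f"observed order = {observed_order(coarse, fine):.3f}")
```

An observed order close to the formal order supports the claim that the discretisation is implemented correctly; a lower value would prompt investigation. This kind of check is only possible because the exact solution is known, which is not the case for validation comparisons.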


Two additional features are needed to strengthen and make a validation data comparison more 
robust: 


Uncertainty assessments: experimental data and simulations will inevitably not overlay exactly
when plotted, but if the uncertainty in both is plotted as error bars, then the extent of
overlap between the estimated error bounds can be evaluated. The simulation results would
need to be subject to a UQ process that reflects the model and numerical uncertainty, and that
also propagates as realistic an assessment as possible of the uncertainty in the geometry
and conditions of the experiment. This means that the requirements for characterising the
uncertainty in both the setup and the results of the experiment can be onerous.

Quantitative validation metrics: the difference between the values of the experimental data and
simulations can be calculated as an error or difference term, potentially using a statistical
approach where the PDFs of the uncertainty can be assessed. If a comparison is only being
made at a small number of locations, then each can be assessed individually. If, however,
there is 1D, 2D or 3D field information, or transiently varying data, then a single evaluation,
such as a sum of squares error, may be appropriate. If the locations or times are not aligned
between the experiment and simulation, then interpolation will be required. Measurements
may also be clustered with irregular density (or the results in different regions may be of varying
relative importance), so it may not be justified to assign a uniform weighting to each point.
Quantifying the level of agreement may therefore involve substantial difficulties and choices
in approach. Oberkampf and Barone (2006) review methods of defining metrics, although
the topic is also not necessarily a settled issue.
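The two features above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a recommended metric: the function names, the linear interpolation and the weighted root-mean-square form are choices made here for illustration. The combination of uncertainties follows the form understood to be used in ASME V&V 20 (comparison error E = S − D, assessed against a validation uncertainty combining numerical, input and experimental contributions in quadrature), which assumes that the three contributions are independent.

```python
import math
from bisect import bisect_left


def interpolate(x_sim, y_sim, x):
    """Linearly interpolate a simulation profile (x_sim ascending) at location x."""
    if not x_sim[0] <= x <= x_sim[-1]:
        raise ValueError("measurement location lies outside the simulated range")
    j = bisect_left(x_sim, x)
    if x_sim[j] == x:
        return y_sim[j]
    t = (x - x_sim[j - 1]) / (x_sim[j] - x_sim[j - 1])
    return y_sim[j - 1] + t * (y_sim[j] - y_sim[j - 1])


def weighted_rms_difference(x_exp, y_exp, x_sim, y_sim, weights=None):
    """Single-value metric: weighted RMS of (simulation - experiment).

    Weights allow clustered or less important measurements to be de-emphasised;
    uniform weighting is used if none are given.
    """
    if weights is None:
        weights = [1.0] * len(x_exp)
    total = 0.0
    for x, d, w in zip(x_exp, y_exp, weights):
        e = interpolate(x_sim, y_sim, x) - d
        total += w * e * e
    return math.sqrt(total / sum(weights))


def validation_comparison(s, d, u_num, u_input, u_d):
    """Return the comparison error E = S - D and the validation uncertainty.

    Assumes independent standard uncertainties for the numerical solution
    (u_num), the simulation inputs (u_input) and the experimental data (u_d).
    """
    e = s - d
    u_val = math.sqrt(u_num ** 2 + u_input ** 2 + u_d ** 2)
    return e, u_val


if __name__ == "__main__":
    # Hypothetical profile: simulation on a coarse grid, measurements in between.
    x_sim, y_sim = [0.0, 1.0, 2.0], [0.0, 2.0, 4.0]
    x_exp, y_exp = [0.5, 1.5], [1.0, 3.5]
    print(weighted_rms_difference(x_exp, y_exp, x_sim, y_sim))
    print(validation_comparison(s=3.0, d=3.5, u_num=0.3, u_input=0.0, u_d=0.4))
```

If the magnitude of E is comparable to or smaller than the validation uncertainty, the comparison cannot distinguish model error from the combined uncertainties; if it is much larger, the difference is attributable mainly to the model.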


Including uncertainty in comparisons presents an opportunity for perverse or paradoxical
motivations. If it is judged to be a positive result for validation that the uncertainty bands of
experiment and simulation overlap, then measures that increase the apparent uncertainty in either or
both make this agreement more likely (Coleman and Stern, 1998), which is at odds with the motivation
to improve accuracy.


Comparing errors and developing metrics is valuable in activities beyond validation, for
example when comparing modelling options (such as the choice of turbulence modelling approach) or the
effect of mesh refinement. Synthesising the performance and agreement into a single value that
can be ranked or plotted to compare simulations with each other has similar benefits.


Metrics are a particularly powerful tool if a specific tolerance is required, or when the metric itself 
has meaning that can be used to produce a more objective assessment of the adequacy of a pre- 
diction, and fair means of comparison. However, unless preventative steps are taken, it should be 
recognised that validation assessment can be subject to cognitive biases (such as anchoring or 
confirmation bias, Kahneman, 2011). It is not uncommon for the originator of a simulation, who 
is likely to be invested in its success, to be the one to compare the results to experimental data. 
The selection of which locations to compare, and how to compare them, as well as which mod- 
elling options to use, is subject to the tendency to keep reassessing and tailoring the evaluation 
until good agreement is reached, and to stop ‘digging’ into the detail at this point. It may need a 
conscious level of self-discipline, or assessment by an independent third party, to look closely and 
give sufficient prominence to areas of disagreement. Blind comparisons can avoid these problems,
but where the simulation is being performed for exploratory reasons, defining the locations,
nature, purpose and tolerance of the appropriate comparisons in advance may also be
misleading.
How and When to Use UQ and SA


Adding to the confidence that can be placed in modelling predictions is an iterative process, which, 
for anything other than one-off or single discipline analysis, would benefit from being controlled, 
documented and communicated within an organisational approach. UQ and SA have a role in this, 
but for practical models in an industrial setting, how and when they are best applied in a specific 
organisation needs to be seen in the context of a typical model development process: 


* A simplified sub-model or ‘toy’ replica of the problem containing the essential functionality 
that will be employed in the modelling software is often a valuable initial step. It provides a 
tractable ‘sandbox’ to test and practice with boundary conditions, material models, modelling 
options, naming conventions, mesh/nodalisation approaches and resolutions. It can be cre- 
ated with a minimal cell/node count, so can be debugged and optimised with rapid solution 
times. Many of the implementation problems that occur in industrial modelling are caused 
by working at scale: models with dozens or hundreds of separate components or boundary
conditions to be defined rarely work first time. The knowledge and confidence gained
because the intended setup works well on the ‘toy’ allows the problem-solving approach to
separate the task of looking for the error that is stopping a solution from working from the
concern that the approach may not be the correct one, and may never work. It is common when
trying to debug a large scale model to change many settings in a short period to attempt to
find the source of a problem. It is easy to then forget to reset some of them once it has been
resolved, so the minimally invasive and focused approach that experience from the ‘toy’
can provide is valuable in preventing errors and confusion. This is well aligned with the use
of pilot systems in software engineering, where it is usually beneficial to ‘plan to throw one
away; you will, anyhow’ (Brooks, 1995).


* Sensitivity analysis (using the sampling and propagation methods that it shares with
UQ) can provide numerical results and can be applied at any point in an analysis process,
but only gives insight into the parameters selected for analysis. Real models contain too many
inputs to address the effect of each on the results, so an initial screening assessment
will be essential. The aim is to decide what matters: to quickly determine
how well known the parameters are, and whether uncertainties in them are significant for the output
metric. This usually involves ad-hoc and exploratory work and should emphasise identifying
whether expected phenomena can be reproduced. Despite being exploratory,
omissions or errors in this process can be significant if they mischaracterise the importance
of an aspect, so documenting the process and decisions, producing meaningful metrics (similar
to or the same as those discussed above for validation) and making it readily repeatable
are advantageous.


* During the various stages of constructing and using a model, the plant design and operating
conditions may evolve, and more refined information will be available from other modelling 
or design disciplines, and so initial screening assessments may need to be revisited, and 
updated ‘baseline’ results produced that reflect the new condition. Automation of solution 
processes and consistent post-processing of key components is very valuable to allow the 
effect of these changes on the output metrics to be consistently evaluated. 


* Despite well developed methods and frameworks being available, attempting to apply formal
UQ and SA methods rigorously too early is likely to add expense and confusion to real
models of complex industrial problems. Such a model needs to achieve a sufficient level of
maturity and robustness first. While a model remains under-resolved, difficult to converge
and only approximate in its predictions when compared with experimental data,
these methods cannot provide insight with the certainty they promise, and the effort is not
justified. It may be the case that a model never becomes good enough for them to be suitably
applied, or that the answers the model produces are sufficient without applying them.


* UQ and SA are valuable activities to perform when a model has been developed to the point 
where its outputs are going to be used in a decision or substantiation. UQ allows better 
validation comparisons to be made, BEPU arguments to be made for safety assessment, 
and the size of a margin to be quantified in a conservative calculation if challenged. SA,
by quantitatively relating changes in inputs to their effect on output metrics, illustrates
how well these features need to be controlled to maintain a desired margin or performance.
Guidance on how to interpret the outputs of UQ and SA is given in a number of places in this 
technical volume: 

Section 2.2.3: Visualisation. 

Section 4.5: Sensitivity Analysis. 

Section 4.7: Assessment of Output Uncertainty from Propagation. 
Section 4.8.5: Presenting Overall Uncertainty in an Output Metric. 


This description does not indicate where QA verification should be performed, the level of code 
verification necessary (which will depend on the maturity of the software), and whether a focus of 
effort on solution verification is warranted. 


The integration of validation comparisons also depends on whether the comparisons are being
performed with the full model, or using existing literature data from a related, but separate,
configuration. In the latter case, the learning available from performing these validation simulations
early in the process will aid the model development. This can be done alongside the ‘toy’ model, or in
some cases it can also serve as the ‘toy’. If experimental results representing the full model
configuration are being produced while the model is being constructed⁶, then iterative loops of
experiment, simulation and comparison will be required, and the amount of UQ necessary at each will
need to be chosen to suit.
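The initial screening step in the model development process described above can be sketched as a one-at-a-time study. This is a minimal illustration only: the ‘toy’ wall temperature model, the input names and the ranges are hypothetical, and in practice `model` would wrap a run of the system code or CFD model. Each input is moved to the ends of its assumed range while the others are held at nominal values, and the inputs are ranked by the resulting swing in the output metric.

```python
def screen_inputs(model, nominal, ranges):
    """One-at-a-time screening of inputs.

    model   : callable taking a dict of inputs and returning a scalar output metric
    nominal : dict of nominal input values
    ranges  : dict mapping input name to (low, high) plausible bounds
    Returns the baseline output and a list of (name, swing) sorted by swing.
    """
    baseline = model(nominal)
    swings = {}
    for name, (low, high) in ranges.items():
        outputs = []
        for value in (low, high):
            trial = dict(nominal)  # copy so that other inputs stay at nominal
            trial[name] = value
            outputs.append(model(trial))
        swings[name] = max(outputs) - min(outputs)
    ranked = sorted(swings.items(), key=lambda item: item[1], reverse=True)
    return baseline, ranked


def toy_wall_temperature(inputs):
    """Hypothetical 'toy' model: wall temperature from a convective heat balance."""
    return inputs["t_fluid"] + inputs["heat_flux"] / inputs["htc"]


if __name__ == "__main__":
    nominal = {"heat_flux": 1.0e5, "htc": 5000.0, "t_fluid": 300.0}
    ranges = {
        "heat_flux": (0.95e5, 1.05e5),  # W/m2, assumed well known
        "htc": (4000.0, 6000.0),        # W/m2K, assumed poorly known
        "t_fluid": (298.0, 302.0),      # K
    }
    baseline, ranked = screen_inputs(toy_wall_temperature, nominal, ranges)
    print(f"baseline = {baseline:.1f} K")
    for name, swing in ranked:
        print(f"{name:10s} swing = {swing:.2f} K")
```

A one-at-a-time study does not capture interactions between inputs, so it is suited to deciding which parameters deserve further attention rather than to quantifying their effect. Keeping the study as a script makes it repeatable when the design or operating conditions change.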


Methods for System Codes 


System code methods evolved during an extended period of reactor design and construction, and 
with the benefit of large scale experimental facilities able to produce modelling closures and valida- 
tion data. They are the accepted and underpinning technology for reactor analysis and licensing, 
and as such there are mature approaches to producing analysis which commands confidence, 
and there are established UQ methods, all of which are well documented. This is not the case with 
CFD, which will be discussed separately (Section 3.4). 


⁶ And may be required to provide experimental inputs to calibrate hard-to-characterise features.


This history, and their role in the development of BEPU methodology, mean that the available
resources discussing them are extensive, and this section intends mainly to highlight overarching
principles, large scale programmes and example benchmark activities associated with confidence
and uncertainty. The following books, reports and papers are suggested as a starting point:


Volume 1, Section 4.6: The CSNI validation matrices for Light Water Reactors (LWRs) are dis- 
cussed, as are the basics of scaling. 


US NRC (2005) and Wilson (2013): The established nature of demonstrating the adequacy of 
system code models is well described, as is the motivation and relevance of understand- 
ing uncertainty, and links to the organisational approaches described in Section 3.1.1. 


IAEA (2002), IAEA (2003) and IAEA (2008): Provide guidance and details for when to use con- 
servative and BEPU methods. To avoid potential confusion between the terms ‘deterministic’ 
and ‘probabilistic’ analysis in these references, it should be noted that the probabilistic aspect 
refers to the likelihood of occurrence of initiating events for accidents or transients and avail- 
ability of plant systems, and a deterministic analysis models a particular set of conditions. A 
deterministic simulation may be conservative or best estimate in nature, and will still make 
use of uncertainty quantification methods (IAEA, 2019). 


D’Auria (2017): Chapter 13 discusses V&V, uncertainty and scaling for system codes, and Chap- 
ter 14 covers how BEPU can be applied. 


Martin and Frepoli (2019): Scaling, V&V and best estimate methods are covered. 


A discussion of the sources of uncertainty in system code thermal hydraulic calculations is given in 
IAEA (2008, Annex I). A key point, particularly in contrast with CFD is the extensive use of and need 
for simplified models and closures/correlations in system codes. They take account of complex 
physics, reduce computational expense, enable high level modelling and are the manifestation of 
where experimental data is captured and exploited. The validity range and implementation details 
of these correlations, as well as uncertainty in their source data, can have a significant impact 
on the uncertainty introduced into the overall calculation. The use of correlations is relevant to
subchannel codes too (Salko et al., 2016, Jones et al., 2018), although these are not discussed in
detail here.


Linked to the use of correlations is the issue of scaling. Correlations are typically based on non- 
dimensional characterisations of the particular mass, momentum or energy transport process they 
replicate, and have a range of validity based on how they were derived. Reduced scale test facil- 
ities inevitably lead to distortions of the replication of the geometrical aspect ratios, flow regimes 
and timescales (Dzodzo, 2019), because, compared to a full-sized reactor, they can only cover a 
limited range of these non-dimensional parameters, and cannot align them all simultaneously. This 
leads to errors, uncertainties and limited validation coverage in the system code models which are 
derived from or validated against such scaled tests. The links between scaling and uncertainty are 
discussed in CSNI (2017b) where common uncertainty methods are described, based on either 
the propagation of input uncertainty (as covered in Section 2, such as CSAU or the GRS method, 
Glaeser, 2008) or methods based on accuracy extrapolation (UMAE). These three methods receive
prominence in much of the international literature, partly because there is commonality in
the contributing institutions and authors that developed them across several publications. As with
V&V frameworks, the implied level of consensus, and the extent of coverage of a diversity of
approaches⁷, may reflect this.


⁷ Or evaluation of their consistency with each other or completeness (Pourgol-Mohamad et al., 2011).
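As a worked illustration of input-propagation methods, the number of code runs needed is often set using an order-statistics (Wilks) argument, which is commonly associated with the GRS method. The text above does not state this basis, so it is an assumption here, and the function names are illustrative. For a first-order tolerance limit, the sample size depends only on the required population coverage and confidence level, not on the number of uncertain inputs.

```python
def wilks_one_sided(coverage=0.95, confidence=0.95):
    """Smallest n for which the largest of n outputs is a one-sided
    tolerance limit: 1 - coverage**n >= confidence (first order)."""
    n = 1
    while 1.0 - coverage ** n < confidence:
        n += 1
    return n


def wilks_two_sided(coverage=0.95, confidence=0.95):
    """Smallest n for which the smallest and largest of n outputs bound a
    two-sided tolerance interval (first order):
    1 - coverage**n - n * (1 - coverage) * coverage**(n - 1) >= confidence."""
    n = 2
    while (1.0 - coverage ** n
           - n * (1.0 - coverage) * coverage ** (n - 1)) < confidence:
        n += 1
    return n


if __name__ == "__main__":
    print(wilks_one_sided())  # 59 runs for a 95 %/95 % one-sided limit
    print(wilks_two_sided())  # 93 runs for a 95 %/95 % two-sided interval
```

The values of 59 and 93 runs apply to first-order limits at 95 % coverage and 95 % confidence; higher-order limits, which are less sensitive to outliers, require more runs.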
Validation is also linked to scaling, where the effect of scaling distortions can be mitigated by using
experimental data from facilities of different sizes and based on different scaling approaches;
these are known as counterpart tests (CSNI, 2017b).


The OECD NEA CSNI have organised an ongoing series of BEPU thermal hydraulics modelling 
programmes over more than 25 years: 


* Uncertainty Method Study (UMS).

* Best Estimate Methods Uncertainty and Sensitivity Evaluation (BEMUSE).

* Post-BEMUSE Reflood Model Input Uncertainty Methods (PREMIUM).

* Systematic APproach for Input Uncertainty quantification Methodology (SAPIUM).


The completed phases are described in CSNI (1998), CSNI (2011a) and CSNI (2017a) (or Nouy
and de Crécy, 2017, Skorek et al., 2019). The lessons from the first phases led to PREMIUM giving
increased attention to input uncertainty, and this is currently being expanded by SAPIUM to use IUQ
(Baccou et al., 2020), where the uncertainty in inputs is assessed based on comparisons
between experimental data and model predictions. It should be noted that in these studies, user
effects were observed both in the creation of model predictions and in the application of UQ methods.


The preceding discussion is focused on the application of system codes to LWRs, where system 
code closures were developed from extensive experimentation and the UQ methods for them can 
take advantage of this experimental underpinning. For new designs of Gen IV reactors, there is less 
experimental evidence, particularly from IETs, although the underlying principles of modelling and 
UQ will be largely applicable. For example, international benchmarking programmes have been
conducted where a range of organisations modelled tests conducted on two historical
Sodium-cooled Fast Reactors (SFRs): Phénix (IAEA, 2013) and EBR-II (IAEA, 2017). The same EBR-II
tests were recently modelled using SAM, with Dakota using Latin Hypercube Sampling (LHS) to
sample inputs to apply UQ to the prediction (Mui et al., 2019).


Methods for CFD 


The application of CFD to nuclear safety is a topic that is under active consideration by interna- 
tional organisations and national regulators. This scrutiny places a requirement to account for and 
demonstrate uncertainties within a CFD analysis with a high level of confidence, with a focus on 
validation comparisons, on the ability of the chosen approach to turbulence to model the phenom- 
ena encountered, and on demonstrating the adequacy of mesh quality and resolution (Downing 
et al., 2018). 


The experience and methods originally developed for system codes in NTH simulations described 
above have been used as some of the starting points for applying UQ to CFD: 


* In addition to publishing CFD best practice guidance (CSNI, 2015), the CSNI have presented
a CFD-specific summary of sources of uncertainty, have contrasted the characteristics of
uncertainties for (single-phase) CFD with those for system codes, and provided some examples of the
application of a range of different methods (CSNI, 2016b). This document serves to highlight
that UQ in the context of CFD for NTH is currently not a settled issue, because there are few
actionable recommendations and the description of the methodologies is not amenable to
being adopted by readers (reinforcing the position described in Section 1.4).


* CSAU is a UQ method applied to system codes, and Acton and Baglietto (2019) have
considered what would be needed to apply it to CFD, also providing a summary of the needs and
differences of the use of CFD compared to system code justifications. Uchida and Kawamura
(2008) also describe error estimation for CFD simulations in the CSAU framework.


In its broader industrial application, however, CFD has paid attention to V&V and UQ for several
decades, driven in many cases by aerodynamics (Walters and Huyse, 2002); some of the
CFD-specific frameworks described in Section 3.2.1 originate in the aerospace CFD domain. This is
understandable, given that, while the prediction of lift, drag and separation on air-frames is a
challenging task, it is possible to make very detailed measurements, and tune turbulence models to
suit. Aerodynamics does not, however, have the range of length scales that can be encountered in NPP
simulations, nor does it need to predict, for example, buoyant heat transfer or two-phase flow.


The propagation of uncertain inputs through a CFD model can be achieved using the same methods
as any other calculation. However, CFD model evaluations are almost inevitably computationally
expensive, so any UQ or SA method applied must anticipate that the number of
runs will usually be limited to dozens (or hundreds), and the most benefit must be gained from
the chosen set. This places emphasis on the optimisation of mesh, modelling choices and early
sensitivity assessment to select the best setup.
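One way of gaining the most benefit from a small set of runs is a stratified design such as Latin Hypercube Sampling, which is mentioned above in the context of system codes. The following is a minimal sketch, assuming independent, uniformly distributed inputs; the input names and bounds are hypothetical. The range of each input is divided into as many equal-probability strata as there are runs, one value is drawn in each stratum, and the strata are shuffled independently for each input, so that every input is sampled across its whole range even when only a few runs are affordable.

```python
import random


def latin_hypercube(n_runs, bounds, seed=0):
    """Generate n_runs input sets by Latin Hypercube Sampling.

    bounds : dict mapping input name to (low, high); uniform distributions
             are assumed for every input.
    Returns a list of dicts, one per run.
    """
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in bounds.items():
        # One random point inside each of n_runs equal-probability strata.
        points = [(i + rng.random()) / n_runs for i in range(n_runs)]
        rng.shuffle(points)  # decouple the ordering between inputs
        columns[name] = [low + u * (high - low) for u in points]
    return [{name: columns[name][i] for name in bounds} for i in range(n_runs)]


if __name__ == "__main__":
    # Hypothetical uncertain boundary conditions for a small set of CFD runs.
    bounds = {
        "inlet_velocity": (0.9, 1.1),      # m/s
        "wall_heat_flux": (4.5e4, 5.5e4),  # W/m2
    }
    for run in latin_hypercube(5, bounds):
        print(run)
```

Each returned dictionary defines the inputs for one run; the outputs of those runs can then be used for the statistical assessment. Tools such as Dakota provide production implementations that also handle non-uniform and correlated inputs.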


Model and Numerical Uncertainty in CFD 


Parametric and initial/boundary condition uncertainty only accounts for some of the overall
uncertainty in a prediction. There are a number of other notable features of uncertainties
and UQ for CFD that create model and numerical uncertainty:


* The topic of mesh independence and mesh resolution has been extensively studied, and
rigorous methods such as Richardson extrapolation or the Grid Convergence Index (GCI) (Roache,
1997, ASME, 2009, V&V 20) are able to quantify its effects. For transient simulations, it is
equally important to look into time resolution, although the Courant-Friedrichs-Lewy (CFL)
number provides a good guide, and the timestep cannot be varied independently of mesh
resolution (smaller cells need smaller timesteps to maintain a given CFL number). To establish an
appropriate time step, the timescales or frequency of the modelled phenomena should also
be considered, as discussed in Volume 3, Section 3.2.


* The approach to modelling turbulence, in terms of using RANS (or URANS), LES or hy- 
brid methods, and which model and options to choose within these types will often have a 
determining effect on the accuracy of a simulation. 


* The numerical uncertainties introduced by the discretisation of the PDEs of the Navier-Stokes
equations via the finite element or volume method, and the discretisation of spatial and
temporal gradients⁸, can in some situations be estimated, but for complex 3D flows in
unstructured meshes this is very challenging and not routinely assessed.


8 Often described as using ‘first order’ or ‘second order’ methods, although this simple distinction belies a large range of 
possible options and their characteristics.


* The set of transport equations in CFD codes is always solved by some form of iterative
algorithm. In ideal cases, the residuals of these equations can be decreased unambiguously
to steady values below a specified tolerance. However, in the case of complex flows or more
complex physics (like combustion), where mesh resolution or quality is inadequate, or where
there is inherent unsteadiness in flow fields, then the residuals (and other output metrics
used to monitor convergence) are often not stable, and cannot be reduced without actions
that introduce more uncertainty⁹.


* Some uncertainties are specific and relevant to only some problems, for example: 

— The interaction of buoyancy with turbulence. 

— The prediction of shockwaves and sonic/choked flow. 

— Porous model simulations, particularly the adequacy of the porous representation, its 
interaction with adjacent regions of non porous fluid, or the treatment of turbulence 
within the porous region. 

— In resolved turbulence simulations (such as LES), correctly preserving the (spatial and 
temporal) frequency content can be highly important in some applications (although is 
less important if only the mean flow fields are needed). 

— In multiphase simulations, the treatment of free surface motion has several options and 
can introduce significant uncertainties, especially in situations where surface tension is 
important. 

— When modelling thermal radiation, the chosen model and numerical settings can have 
a strong influence on the cost and accuracy of a simulation. 
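
The coupling between timestep and mesh resolution noted in the first item of the list above can be
illustrated with a short calculation. This is a minimal sketch only: it assumes the usual
one-dimensional definition CFL = u Δt / Δx, and the function name, velocity and cell sizes are
hypothetical values chosen for illustration.

```python
def max_timestep(cfl_target, cell_size, velocity):
    """Largest timestep that keeps CFL = |u| * dt / dx at or below cfl_target."""
    return cfl_target * cell_size / abs(velocity)


# Hypothetical values, for illustration only.
local_velocity = 2.0   # m/s
dx_coarse = 4.0e-3     # m
dx_fine = 2.0e-3       # m (one level of mesh refinement)

dt_coarse = max_timestep(1.0, dx_coarse, local_velocity)
dt_fine = max_timestep(1.0, dx_fine, local_velocity)

# Halving the cell size halves the allowable timestep at fixed CFL.
print(f"dt (coarse) = {dt_coarse:.2e} s, dt (fine) = {dt_fine:.2e} s")
```

In a 3D transient, halving the cell size in every direction therefore multiplies the cell count by
eight and the number of timesteps by two, which is one reason formal mesh and timestep refinement
studies become expensive quickly.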


Discretisation schemes and turbulence models are ‘categorical variables’, because they represent 
discrete choices, and there is no inherent ranking of their characteristics, so they cannot be propa- 
gated through a UQ method using an approach that samples a probability distribution. However it 
is possible to include them, as well as the numerical and mesh resolution uncertainties, if they can 
be estimated, using the methods discussed in Section 4.8 as epistemic uncertainties. The extent 
that this can be expected to be achieved for some of these topics is discussed in Section 3.5.2. 


Validation Data and Benchmarks 


High-fidelity validation data that is relevant to a new problem to be modelled by CFD can be
difficult to obtain. Databases of such data are referenced in Volume 1, Section 4.3.2 and reviewed
in more detail by Frazer-Nash (2019, Section 7); these relate particularly to relatively simple
flows (basic tests and SETs).


When time and cost constraints are applied in an industrial context, the question will inevitably
arise: Do I need to validate this? Can I trust that the CFD code will be good enough for this
problem? The answer to this question depends on the problem at hand, and the consequences
of using or placing trust in the answers, but it is possible to use the experiences reported from
international benchmarking activities to guide this.


The IAEA have published studies relating to reactor and fuel analysis (IAEA, 2014, IAEA, 2020a, 
IAEA, 2020b), and a longer and broader series of benchmarks have been organised by the CSNI. 


9 For example, using first order ‘upwind’ differencing for the convection term in the momentum equation damps oscillatory
processes, but introducing this diffusivity makes the velocity solution less accurate.


In the benchmark studies, a number of groups each simulate a well characterised laboratory
experiment that has high-fidelity data available, using methods of their choice. The topics covered
are:


* Mixing at a T-junction, giving rise to thermal fatigue (CSNI, 2011b).


* Turbulence downstream of a rod bundle spacer grid (CSNI, 2013).


* Mixing of a buoyant jet with a stratified layer in a large volume (CSNI, 2016a).


* Turbulent mixing of adjacent streams with a density difference, including the application of
UQ (CSNI, 2019).


* The buoyancy driven mixing of fluid in a scaled representation of the cold leg and downcomer
of a PWR (Orea et al., 2020, CSNI report not yet released at time of writing).


* FSI of two inline cylinders, predicting the flow field and structural response (in progress at
time of writing).


A key feature of the exercises is that they have an ‘open’ and a ‘blind’ phase. In the open phase,
which happens first, the experimental setup and results are provided, so the quality of the
prediction agreement can be judged by each group and the modelling approach optimised. The blind
phase follows this, where the same experimental apparatus is run with a difference in conditions.
Only the setup is provided to participants and their predictions are submitted without being able to
compare them to the experiment. This replicates some of the features of trying to make predictions
in real simulations, because it tests how transferable the conclusions reached in the open phase,
where validation data is available, are to a similar condition where it is not.


An objective of these benchmarks is to compare the predictions made by a range of methods,
meshes and codes that are chosen by the participants. This allows an assessment to be made
of user effects (for example, choices relating to the location and characterisation of flow inlets)
and comparisons to be drawn¹⁰ between CFD codes and turbulence modelling approaches, and
some guidance on best practices to be developed for the item of plant at hand. A notable feature of
these benchmarks is, however, the variability in the predictions made in some of the cases. This is
even the case where an effort has been made to define a ‘common model’ for some of the results
(Andreani et al., 2019).


While they generate publicly available data, detailed official reports and additional publications from 
the participants (Cutrono Rakhimov et al., 2019, for example) as well as international collaboration 
and cooperation, these exercises take several years to complete and report. They also only address 
specific items of plant and phenomena. Therefore, while they serve a highly valuable purpose, if 
the problem to be modelled at hand is not covered by them, then their outputs and lessons can only 
be used as a starting point for asking questions, rather than providing readily applicable answers. 


Another conclusion that can be drawn from these benchmarks is that accumulating experience 
with a type of plant and its phenomena, and evaluating suitable meshing and modelling options is 
very important. However, relying solely on this experience when applying CFD to a new case has 
risks — there is still a need to exercise care and re-evaluate mesh and modelling decisions. 


10 or at least the dependency of predictive performance on these parameters is highlighted — often the various benchmark 
results have too many differences to draw firm conclusions.


Part of the difficulty in making transferable conclusions is that the comparison of the effects of
differences in setup and results is not fine grained enough. There are too many physical processes
occurring and too many modelling differences present at once to separate out what really makes
the differences. For example, it is noted in Andreani et al. (2019) that unless the representation
of the spreading of a jet as it enters the domain is correctly predicted (which can be assessed
separately), the rest of the model prediction of that jet’s behaviour will be compromised. Studies
A and D also provide examples of some of the challenges in comparing CFD models to validation
data (where, in both cases, the experiments were designed specifically for CFD validation), and
unravelling the contributions of and sensitivity to various physical and modelling aspects.


Unless the component parts of a simulation can be assessed and optimised, the assembly
of the validation arguments for the whole simulation may be compromised. This is aligned with the
discussion in Section 3.2.4, where the most important stage of an analysis is ‘getting the basics
right’, and knowing how costly or practical this is. It may be the case that adequately resolving
each feature is too costly. In the past, when computing hardware was a more significant limitation,
this was certainly true (Dzodzo, 2021).


The conclusion from considering how benchmarks affect the answer to the question: Do I need
to validate this? is to dismantle the whole problem into its parts and to reverse the question: what
are my reasons to believe that this part is adequate? By doing this, a hierarchical approach can be
assembled, whereby a larger and more complex problem with many features and phenomena can be
assembled and used for predictive purposes¹¹, with some confidence that none of its components
is going to contribute unbounded inaccuracies. System codes have been through this process over
decades, and have gained acceptance despite containing substantial empirical elements, but CFD
codes are still not as widely accepted for safety assessments. Applying UQ to the simpler parts of a
more complex model hierarchy, and demonstrating that they can be used to establish the uncertainty
in a realistic model, may be the only method that can provide sufficient confidence to allow BEPU
CFD simulations with safety significance to be accepted.


CFD used for Safety Justification: Dry Spent Fuel Storage 


It is instructive to recognise that there are areas with nuclear safety implications where design and 
licensing decisions have been made with a significant contribution from CFD. The dry storage of 
spent LWR fuel in pressurised canisters requires the Peak Cladding Temperature (PCT) of fuel 
elements at the centre of these canisters to be predicted with confidence, so that it is assured that 
they will not fail in handling or storage. This is an example of the type of output metric discussed 
in Section 2.1.3 where the PCT is the hottest part of the hottest fuel assembly (hence is in the tail 
of a probability distribution) which will contain modelling uncertainty, and the probability of failure 
of a fuel pin will also depend on its internal pressure, which will be represented as a probability 
distribution of its burnup history, which will contain epistemic and aleatoric uncertainty. 


This is a difficult modelling task for lower fidelity models, like system codes, because it involves 3D
Conjugate Heat Transfer (CHT), radiation heat transfer and buoyancy driven flows, where subtle
pressure loss and driving force balances need to be established. However, it is a tractable task
for CFD because the equipment is not very large or complex, and so does not need a prohibitive
number of cells. The fuel bundle can also be represented with a porous medium model, radiation
and conduction can be included in most CFD codes, and the buoyancy and pressure loss can be
modelled without substantial approximation.


11 This must be the objective for any modelling activity, otherwise if complete and prototypical validation is needed for each
simulation, then modelling would serve little purpose, and the test results would simply be used.


This has led to dry spent fuel storage being a ready-made example, with documents published
describing specific CFD best practice for passive cooling and CHT, plus consideration of validation
and uncertainty (US NRC, 2013, US NRC, 2019a, Li and Liu, 2016). This also demonstrates the
point that NTH does not need to be restricted to analysis of in-reactor primary circuit components,
and is required for a range of applications.


Applying Polynomial Chaos 


Polynomial Chaos Expansion (PCE) is one of the UQ tools most often applied to CFD in recent
literature, relating to, for example, aerodynamics (Hosder and Walters, 2007, Schaefer et al., 2016)
as well as its application to NTH (CSNI, 2016b, Khalil et al., 2018, Wenig et al., 2021). The concepts
of PCE are introduced in Section 4.4.2.4.


PCE can be relatively easily applied to create surrogate models for a small number of output 
metrics from a simulation, such as maximum values, combined/integrated values or values at a 
specific location. This provides the mean and standard deviation of the output metric, the Sobol 
indices for the influence of the uncertain parameters, and the surrogate models can be further 
sampled to determine the CDF of the outputs. The uncertain parameters chosen to vary, and the 
number and distribution of CFD solutions required will need to be chosen carefully, particularly if 
they are expensive to evaluate. Study B provides an example of this. 
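
The workflow described above (fit a surrogate to a small set of runs, then read off the mean,
standard deviation and Sobol indices) can be sketched in a few lines. This is a minimal
illustration under stated assumptions, not the method used in Study B: the inputs are taken to be
independent and uniformly distributed (scaled to [-1, 1]), so a Legendre basis is used; the
coefficients are fitted by least-squares regression; and `stand_in_model` is a hypothetical cheap
analytic function standing in for an expensive CFD evaluation. All function names are illustrative.

```python
import itertools

import numpy as np
from numpy.polynomial import legendre


def legendre_orthonormal(x, n):
    """Degree-n Legendre polynomial, scaled so that E[psi^2] = 1 for x ~ U(-1, 1)."""
    coeffs = np.zeros(n + 1)
    coeffs[n] = 1.0
    return np.sqrt(2 * n + 1) * legendre.legval(x, coeffs)


def fit_pce(x, y, degree):
    """Least-squares PCE fit; x has shape (n_runs, n_inputs), scaled to [-1, 1]."""
    n_inputs = x.shape[1]
    # Keep all multi-indices with total polynomial degree <= degree.
    multi_indices = [a for a in itertools.product(range(degree + 1), repeat=n_inputs)
                     if sum(a) <= degree]
    design = np.column_stack([
        np.prod([legendre_orthonormal(x[:, j], a[j]) for j in range(n_inputs)], axis=0)
        for a in multi_indices
    ])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return multi_indices, coeffs


def pce_statistics(multi_indices, coeffs):
    """Mean, standard deviation and first-order Sobol indices from the PCE coefficients."""
    n_inputs = len(multi_indices[0])
    mean, variance = 0.0, 0.0
    partial = np.zeros(n_inputs)
    for a, c in zip(multi_indices, coeffs):
        if sum(a) == 0:
            mean = c                    # constant term is the mean
            continue
        variance += c ** 2              # orthonormal basis: variance is a sum of squares
        active = [j for j in range(n_inputs) if a[j] > 0]
        if len(active) == 1:            # term depends on a single input only
            partial[active[0]] += c ** 2
    return mean, np.sqrt(variance), partial / variance


def stand_in_model(x):
    """Hypothetical cheap function used here in place of a CFD run."""
    return 1.0 + 2.0 * x[:, 0] + 0.5 * x[:, 1] ** 2 + 0.3 * x[:, 0] * x[:, 1]


rng = np.random.default_rng(0)
samples = rng.uniform(-1.0, 1.0, size=(30, 2))   # 30 'runs', 2 uncertain inputs
outputs = stand_in_model(samples)

indices, coeffs = fit_pce(samples, outputs, degree=2)
mean, std, sobol_first = pce_statistics(indices, coeffs)
print(f"mean = {mean:.4f}, std = {std:.4f}, first-order Sobol = {sobol_first}")
```

Because the stand-in function is itself a second-degree polynomial, the fit here is exact; for a
real CFD output the truncation degree and the number of runs would need to be checked, for example
by holding back some runs to test the surrogate.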


It is also possible to apply PCE to CFD results more broadly, either determining the uncertainty 
and sensitivity at all points along particular sampling lines across the domain, or at every location 
on the surface or in the volume of a mesh. This allows the uncertainty (or sensitivity) in the velocity, 
pressure, or temperature fields (for example) to be visualised using the normal methods of post- 
processing. There are ‘intrusive’ methods that achieve this, where the PCE is implemented within 
and by modifying the underlying equations in the CFD source code, and only one solution is needed 
of the revised equation set (Najm, 2009). However, the source code may not be available, and the 
modifications are likely to be difficult and time consuming, particularly for problems with complex 
physics, so are unlikely to be applied by industrial users. There are also ‘non-intrusive’ methods 
where the CFD code is used without modification, and a sampled set of solutions are performed; 
a PCE model is fitted at each location of choice (which may be all locations in the mesh). 


Developing a PCE result that gives the uncertainty at any point in a flow field should work well for
steady-state flows, or for unstable or transient flows with well converged time-averaged fields.
However, a flow may contain sudden switches in flow regime, such as large changes in the presence,
location or size of separation, multiple possible stable states (for example a jet attaching to
either one wall or another of an expanding duct), or laminar-turbulent transition over the range of
uncertain parameters studied. In this case, interpreting the ‘uncertainty’ simply from the statistics
may be misleading, or give an incomplete or substandard insight (the mean of a result with two
distinct states will not represent the physical behaviour of either). It may be necessary to form
sub-populations of solutions grouped by one of these distinguishing regimes, and then gain more
understanding by studying uncertainty within and between groups, including finding the interface
(‘surface’ in parameter space) where the switches occur.


There are also no widely used commercial or open source CFD codes that currently incorporate 
full-field PCE functionality, and there are no mature and available interfaces to toolkits (such as 
Dakota) that are able to read the CFD results files for each evaluation, generate the PCE models, 
and then write data out in a suitable format for visualisation. Therefore, applying non-intrusive PCE 
currently requires an investment in capability development. This is not facilitated by available litera- 
ture, which is dominated by descriptions that approach the topic from a mathematical perspective, 
rather than with a more approachable implementation and application focus. 
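
Returning to the earlier point about flows with more than one stable state, the following sketch
shows why the overall statistics can mislead and how grouping the runs helps. It is an
illustration only: the uncertain input `inlet_skew`, the output `reattachment` and the switch
location of 0.1 are all hypothetical, and a simple sign test stands in for whatever regime
indicator would be appropriate in a real set of CFD results.

```python
import numpy as np

rng = np.random.default_rng(1)
n_runs = 200

# Hypothetical uncertain input and a stand-in for a CFD output metric: the jet
# attaches to one wall (about +1) or the other (about -1) depending on the input.
inlet_skew = rng.uniform(-1.0, 1.0, size=n_runs)
side = np.where(inlet_skew > 0.1, 1.0, -1.0)
reattachment = side + 0.05 * rng.standard_normal(n_runs)

# Statistics over the whole population describe neither physical state.
overall_mean = reattachment.mean()
overall_std = reattachment.std(ddof=1)

# Group the runs by regime and study each sub-population separately.
upper = reattachment > 0.0
group_means = (reattachment[upper].mean(), reattachment[~upper].mean())
group_stds = (reattachment[upper].std(ddof=1), reattachment[~upper].std(ddof=1))

# Bracket the switch location between the closest runs on either side of it.
switch_estimate = 0.5 * (inlet_skew[upper].min() + inlet_skew[~upper].max())

print(f"overall: mean = {overall_mean:.2f}, std = {overall_std:.2f}")
print(f"groups:  means = {group_means}, stds = {group_stds}")
print(f"estimated switch at inlet_skew = {switch_estimate:.2f}")
```

With more than one uncertain input, the switch becomes a surface in parameter space and a
classifier or targeted additional runs would be needed to locate it.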


Proper Orthogonal Decomposition and Machine Learning 


Runchal (2020, CFD of the Future: Year 2025 and Beyond), reflecting on 50 years of CFD and its
future prospects, predicts that Machine Learning (ML) will impact many aspects of it: data reduction,
model interpretation and post-processing, solution algorithms, as well as data driven turbulence
models.


Notable types of surrogate model receiving prominence in CFD are Proper Orthogonal Decomposition
(POD), as a form of model order reduction (Section 4.4.2.6), and those associated with ML,
namely Kriging (Section 4.4.2.5) and neural networks. These are able to use CFD simulation results to
derive or ‘train’ a surrogate model, which can then rapidly¹² emulate either the time evolution of the
same flow system, or its response to parametric variation, or both together.


Like UQ, the available literature on POD is not generally aimed at practising engineers. However,
the tutorial by Weiss (2019) serves as a straightforward entry point to its application, and the
overview by Cordier and Bergmann (2008) provides more detail and discussion that is still
approachable. Most CFD codes also do not provide this functionality natively, but the toolbox used
by Georgaka et al. (2020) is open source¹³, and the functions needed to perform POD are matrix
operations and the determination of eigenvalues and eigenvectors, which is functionality that
is commonly available (in MATLAB or Python for example). Georgaka et al. (2020) describes the
application of POD to thermal mixing at a T-junction, and Dolci and Arina (2016) provides a
description of the application of POD methods for aerodynamics. The structure of flows in fuel
assembly rod bundles and T-junctions has been studied with the assistance of POD (Merzari and
Ninokata, 2011, Merzari et al., 2013) and it can be equally well applied to experimental results of
similar geometries; for example, Nguyen et al. (2020) extracted the dominant flow structures in the
presence of localised blockages in a rod bundle.
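
To indicate how little machinery the basic method needs, the sketch below performs a snapshot POD
with NumPy. It is a minimal illustration, not the approach of any of the cited references: it uses
the singular value decomposition of the mean-subtracted snapshot matrix (equivalent to the
eigenvalue problem mentioned above), and the ‘snapshots’ are a synthetic one-dimensional field
built from two hypothetical spatial structures rather than CFD results.

```python
import numpy as np


def pod(snapshots):
    """Snapshot POD of a matrix with shape (n_points, n_snapshots)."""
    mean_field = snapshots.mean(axis=1, keepdims=True)
    fluctuations = snapshots - mean_field
    # Columns of `modes` are the spatial POD modes, ordered by energy content.
    modes, singular_values, vt = np.linalg.svd(fluctuations, full_matrices=False)
    energy_fraction = singular_values ** 2 / np.sum(singular_values ** 2)
    temporal_coeffs = singular_values[:, None] * vt   # shape (n_modes, n_snapshots)
    return mean_field, modes, energy_fraction, temporal_coeffs


# Synthetic stand-in for a set of CFD snapshots: a mean plus two oscillating structures.
x = np.linspace(0.0, 1.0, 200)
t = np.linspace(0.0, 10.0, 80)
snapshots = (1.0
             + np.outer(np.sin(2 * np.pi * x), np.cos(2.0 * t))
             + 0.2 * np.outer(np.sin(4 * np.pi * x), np.sin(5.0 * t)))

mean_field, modes, energy_fraction, temporal_coeffs = pod(snapshots)

# Reduced-order reconstruction using only the most energetic modes.
rank = 2
reconstruction = mean_field + modes[:, :rank] @ temporal_coeffs[:rank, :]

print("energy captured by first two modes:", energy_fraction[:rank].sum())
print("max reconstruction error:", np.abs(reconstruction - snapshots).max())
```

For real CFD data each snapshot would be a flattened field (possibly weighted by cell volume), and
the number of modes retained would be chosen from the decay of the energy fractions.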


How to identify the dominant (most energetic) modes in turbulent flows, interpret their structure and
emulate or reconstruct them is a focus of most of the attention for ML. Approaches are proposed
that aim to use ML methods to improve turbulence models (Duraisamy et al., 2019) and there are
methods that aim to avoid modelling turbulence in the conventional way altogether (Hanna et al.,
2020), replacing it entirely with data driven models. An example combining many of the aspects
discussed is provided by the methods demonstrated by Yeh et al. (2018) and Chang et al. (2019),
where the data from LES solutions (selected using a DoE method) are reduced using POD, and Kriging
(aided by Sobol indices) is used to build a physics-based emulator of the flow in a rocket engine
injector, to allow the design space to be surveyed.


12 Potentially three or four orders of magnitude faster than the original CFD.
13 mathlab.sissa.it/ithaca-fv


It should be noted that, like UQ methods, the additional complexity and sophistication of POD and 
ML comes with additional costs — if they play a part in a simulation used to make a decision with 
significant consequence, then a proportionate level of effort is required to demonstrate that they 
are functioning correctly, to estimate the errors and uncertainties that they contain, and to under- 
stand where they reach their trustworthy emulation limits. This will be challenging for non-specialist 
users, and, because their internal operation can be hard to inspect, may prove prohibitively hard to 
demonstrate when modelling a high-consequence, safety critical scenario. 


Applying Effort in the Right Place 


When considering how and when to apply the methods discussed throughout this technical vol- 
ume, it is essential to remember to not become embroiled in or enamoured with their mathematical 
and numerical details, at the expense of maintaining a clear link to reality for any V&V and UQ. Un- 
derstanding the uncertainty rigorously and comprehensively on a relatively small model is tractable 
(including the sources of numerical and modelling uncertainty), but adding together input and ma- 
terial property uncertainty, as well as a range of potential plant conditions makes fully addressing 
UQ for all aspects simultaneously expensive, and risks making it hard to form a clear picture of the 
main effects and where to focus efforts. A progressive approach, building understanding piece-by- 
piece is likely to provide more insight, but cannot necessarily maintain completeness or rigour. 


This section intends to provide a discussion about common error traps, misconceptions and omis- 
sion of sources of uncertainty, as well as potentially misdirected efforts, which can arise if the links 
to reality and context of an analysis are not sufficiently borne in mind. It is also intended to provide 
some prompts to answer the last question posed at the start of Section 1 — how much effort is 
warranted... ? 


Change Control and the Validity of Geometry and Input 


In a design, construction or licensing process, or during operation, changes will be made to plant 
geometry, operating conditions, the characterisation of transients, faults and hazards, or to safety 
thresholds and limits. These changes may invalidate previous NTH assessments, requiring them 
to be repeated. While this could be costly, it is less problematic or insidious than decisions being 
made using the results of assessments that were made using obsolete or invalid information. This 
could easily be a larger source of discrepancy than the other uncertainties in the prediction. 


Preventing this error trap with confidence requires robust change management processes to ‘pull’ 
and ‘push’ the necessary updates: 


* An analyst should ask themselves: Am I using the latest geometry or conditions? How can
I traceably define the state of a model's inputs, and compare them to the current source of
‘truth’?


* The significance of some changes that are made in the design or manufacturing stage for a
component may be obvious to predict with respect to their effect on the NTH behaviour, or the
necessity for a specialist review may be covered by change management rules. However, it
can be the case that changes to small features, such as the presence or absence of
chamfers at openings, or the introduction of contractions or geometry changes that lead to
reduced open areas in flow paths, may not be noted as requiring assessment. A similar point
applies to anticipating the consequences of changes in operating conditions.


These are both potential sources of unsafe latent conditions — holes in the ‘Swiss cheese’ (Volume 
1, Section 2.2.3). While it is an organisational (rather than a modelling) challenge to recognise, 
anticipate and prevent them, experience has shown that, for the latter type, the significance of 
some changes for plant flows and heat transfer can be missed, misunderstood or underestimated 
by engineers from other disciplines, who are not specialised in NTH. 


Code and Solution Verification vs. Practical CFD Models 


Significant attention is paid in the V&V literature to code verification, and to demonstrating the
‘convergence order’¹⁴ of a software tool. Code verification of the implementation of a method
assesses its convergence order compared to the theoretical behaviour, either by comparison to
analytical solutions, or by using MMS. Approaches are well advanced in a research context (Roache,
1997, Salari and Knupp, 2000) and defined by ASME (2009, V&V 20). The methods and techniques are
also still being developed and expanded (Krueger et al., 2019). Guidance on calculating error
estimates and demonstrating grid convergence for solution verification is also well established
(e.g. using GCI, Section 3.4.1).


While important, much of the formal code and solution verification guidance is applicable largely
to software developers. Ideally it would be made available by the developers of CFD codes, and
potentially run by users as acceptance and confidence-building tests. However, most CFD users
are not developers, and most real problems rarely possess the straightforward and benign problem
characteristics that allow formal order of accuracy assessments and GCI to be pragmatic and
informative. Practical models often need to have automated generation of complex unstructured
meshes, with local refinements and mesh quality optimisations, and can exhibit turbulence model
behaviour and large scale or local flow instabilities that spoil the idealisation. It is also the case
that the expected formal convergence order can be mathematically proven for simple idealised
geometries, but this is unlikely to be possible for 3D meshes of cells that are skewed/non-orthogonal
or are complex polyhedral shapes, and that apply more complex numerical discretisation methods.


Judging mesh resolution adequacy can be hard, and formally converging the mesh (via Richardson 
extrapolation, for example) is often impractical for mesh size reasons. Users will need to consider 
whether they could ever afford to properly converge a whole domain (especially for a transient, 
or when needing statistically meaningful and stable averages for an unstable flow). For a large or 
complex geometry, computing limitations or costs mean that it is likely to be only worth applying 
effort to considering the resolution in some key areas of interest. 
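
For the cases where a formal study is affordable, the arithmetic of Richardson extrapolation and
the GCI is simple, as the sketch below indicates. It assumes three meshes with a constant
refinement ratio, monotonic convergence in the asymptotic range, and the commonly quoted safety
factor of 1.25 for three-mesh studies; the function name and the output values (a hypothetical
peak temperature) are illustrative only and are not taken from any study in this volume.

```python
import math


def richardson_gci(f_fine, f_medium, f_coarse, ratio, safety_factor=1.25):
    """Observed order, extrapolated value and fine-mesh GCI from three meshes.

    `ratio` is the constant refinement ratio between successive meshes (> 1).
    """
    eps_fine = f_medium - f_fine
    eps_coarse = f_coarse - f_medium
    if eps_fine == 0.0 or eps_coarse / eps_fine <= 1.0:
        raise ValueError("results are not converging monotonically; "
                         "the observed order is not meaningful")
    order = math.log(eps_coarse / eps_fine) / math.log(ratio)
    f_extrapolated = f_fine - eps_fine / (ratio ** order - 1.0)
    gci_fine = safety_factor * abs(eps_fine / f_fine) / (ratio ** order - 1.0)
    return order, f_extrapolated, gci_fine


# Hypothetical output metric on fine, medium and coarse meshes (refinement ratio 2).
order, f_extrapolated, gci_fine = richardson_gci(10.1, 10.4, 11.6, ratio=2.0)
print(f"observed order = {order:.2f}")
print(f"extrapolated value = {f_extrapolated:.3f}")
print(f"fine-mesh GCI = {100.0 * gci_fine:.2f} %")
```

If the observed order differs markedly from the formal order of the scheme, or the check above
fails, this is itself a sign that the meshes are not in the asymptotic range and that the
resulting error estimate should not be relied upon.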


Code verification is partly used to capture implementation errors (bugs) in numerical algorithms.
Bugs in modelling tools can lead to significant errors in a range of areas, not just related to the
solution of the PDEs (for example in post-processing). Open source CFD tools exist, of varying
quality and maturity, but bugs are also still present in established commercial/proprietary codes.
The contents and quality of any code are a product of its history and application, and new or
infrequently used functionality is riskier to rely on. Open source tools typically have different types
of bug and failure mode compared to commercial codes, but the availability of the source code
means that users can find and understand them. Error reporting mechanisms exist for both open
source and commercial codes, and these should be available to and routinely scrutinised by users
of the software. The error report will state the versions of the code affected by the error, the affected
functionality and consequences of the problem, ways to work around or avoid it, and when the
problem was rectified.


14 Numerical solutions of PDEs that are obtained by truncating a Taylor-series expansion after a certain number of terms, for
example, have an error term that depends on the size of a cell, δx, or timestep, δt, and the order, p, is the rate at which the
error reduces with refinement, e.g. with δx^p or δt^p.


This discussion also needs to be put in the context of considering whether the numerical and mesh 
errors make a significant contribution to the overall uncertainty or assessment of the results of a 
specific CFD model: 


* If the uncertainty in inputs is large, then there is less point in being concerned about smaller
uncertainties in the model. If the model and numerical uncertainties are dominant, then they
do require attention.


* Likewise, if the margin available to a safety or performance limit is large, and the model
behaviour is benign (there are no ‘cliff-edges’ in its predictions), then applying a significant
effort is not beneficial.


These points should be borne in mind when considering the extent of solution verification neces- 
sary, as well as the fact that, in practice, it is difficult. A problem may need to be under-resolved for 
pragmatic reasons, and high quality validation data may not exist. 
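
The first of the points above can be made concrete with a small worked example. It is a sketch
under a stated assumption: the contributions are treated as independent standard uncertainties
that combine as a root-sum-square, and the values (in kelvin, for a hypothetical temperature
prediction) are invented for illustration.

```python
import math


def combined_uncertainty(contributions):
    """Root-sum-square of independent standard uncertainties."""
    return math.sqrt(sum(u ** 2 for u in contributions.values()))


def variance_shares(contributions):
    """Fraction of the combined variance attributable to each contribution."""
    total = sum(u ** 2 for u in contributions.values())
    return {name: u ** 2 / total for name, u in contributions.items()}


# Hypothetical standard uncertainties in a predicted temperature, in K.
baseline = {"input": 5.0, "model": 2.0, "numerical": 1.0}
# Same case after spending effort to halve the numerical (mesh) uncertainty.
refined = dict(baseline, numerical=0.5)

u_baseline = combined_uncertainty(baseline)
u_refined = combined_uncertainty(refined)
shares = variance_shares(baseline)

print(f"combined uncertainty: {u_baseline:.2f} K -> {u_refined:.2f} K")
print({name: round(share, 3) for name, share in shares.items()})
```

With these numbers, halving the numerical uncertainty changes the combined value only from about
5.48 K to about 5.41 K, because the input uncertainty supplies over 80% of the variance; the same
effort directed at the inputs would have a much larger effect.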


Unsteadiness in Steady-State CFD Solutions 


If a flow is complex, for example with bluff bodies and separation, dividing and combining flow, or
large buoyant plumes, then a CFD solution will always be inherently unstable or oscillatory to some
extent. Steady-state solutions are often attempted for these applications, because the unsteady
effects can be localised and creating a statistically converged transient result to obtain averaged
results can be significantly more expensive. In the steady solution approach, it is typical to see
variation in the quantity of interest (‘jumpy’ behaviour of monitors of pressure, force, temperature
etc.) or poor residual convergence behaviour. It is tempting to try to avoid this behaviour, for
example by using lower order discretisation or reducing under-relaxation factors, which can
suppress dominant numerical instabilities, and allow a converged solution to be found. However, by
taking this approach too far, it is also possible to artificially suppress the jumps by inhibiting
the physically unstable features of the solution from progressing or varying.


When this unsteady behaviour is seen, it can be beneficial to change the simulation to a
transient as a comparison case, comparing the averaged flow fields or monitored values to their
steady versions¹⁵. In some instances, running transiently makes no difference to the results; in
others it makes a substantial difference. Repeating this judgement in a quantifiable way with robust
statistics to pursue a formal mesh refinement is unlikely to be tractable, and will rarely be practical
in an industrial analysis context (with the computational resources currently typically available).


15 The concept of the time average of a fluctuating quantity being equal to the ‘ensemble’ average of many independent shorter
realisations of this quantity is known as the quantity being ‘ergodic’. Ergodicity has a precise meaning in the statistical
description of turbulence, related to the Reynolds decomposition underlying the definition of RANS (Deissler, 1992). The
description here relating to practical comparisons of steady and unsteady flows is too ‘hand-wavy’ to be considered a
realisation or test of the ergodic nature of the flow; more care and rigour would be needed (Makarashvili et al., 2017).


It is not possible to give general guidance on when these more complex cases occur. It should 
be considered on a case-by-case basis, and it is the responsibility of the analyst to look for and 
demonstrate that these issues are not present. In some cases, it will be necessary to solve URANS 
instead of steady-state RANS and time average. In some cases, a resolved turbulence approach, 
such as LES will be necessary. Either way, it should not be considered sufficient to simply focus on 
mesh refinement (global or local) to assess and assert that the solution is adequate. 
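
When a transient comparison case is run, some measure of how well the time average of a monitored
quantity has settled is needed before it is compared with the steady value. One simple option,
sketched below, is the method of batch means. This is an illustration only: the monitor signal is
synthetic (a hypothetical temperature with a periodic component and random noise), the function
name is invented, and the estimate is only meaningful if each batch is long compared with the
slowest fluctuation in the signal.

```python
import numpy as np


def batch_mean_estimate(signal, n_batches=10):
    """Time average and its standard error, estimated from non-overlapping batches."""
    signal = np.asarray(signal)
    usable = len(signal) - len(signal) % n_batches   # drop any remainder samples
    batch_means = signal[:usable].reshape(n_batches, -1).mean(axis=1)
    mean = batch_means.mean()
    std_error = batch_means.std(ddof=1) / np.sqrt(n_batches)
    return mean, std_error


# Synthetic stand-in for a monitored temperature from a transient run.
rng = np.random.default_rng(2)
time = np.arange(20000) * 0.01                       # 200 s at a 0.01 s timestep
monitor = (300.0
           + 2.0 * np.sin(2.0 * np.pi * time / 5.0)  # 5 s periodic fluctuation
           + 0.5 * rng.standard_normal(time.size))   # broadband noise

mean, std_error = batch_mean_estimate(monitor)
print(f"time average = {mean:.3f} K, standard error = {std_error:.4f} K")
```

If the steady-state value lies well outside a few standard errors of the transient average, the
difference is unlikely to be explained by insufficient averaging time alone.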


Should UQ be Applied to Every Analysis? 


This technical volume is intended to provide sufficient introduction and motivation to allow practis- 
ing NTH engineers to think clearly about the confidence that is placed in their simulation results, 
and indicate the tools to determine and express the uncertainty in them. These final remarks are 
intended to provoke some consideration of how UQ will fit into their circumstance before committing 
to (or advocating that) it should be applied to everything: 


* UQ does not ‘come for free’: it can be expensive, and it adds the burden of asking
questions not only of the adequacy of a simulation, but also of whether the UQ methods and
outcomes are adequate. If UQ is going to be relied on to aid a decision or justification, it has to
be of sufficiently high quality and trustworthiness to justify that reliance.


* For organisations and users of analysis results who are used to making assessments with
deterministic analysis results, applying a probabilistic uncertainty to the result can undermine
confidence in the prediction rather than add to it. It is understandable to feel that the precision
of an answer produced by a long established ‘baseline’ best estimate model implies
its accuracy and certainty. It is not necessarily welcome to be confronted with a (potentially
large) uncertainty estimate where there was none before. The stakeholders in an analysis
may need to have the implications and the potential benefits of UQ carefully explained.


* It can be hard to define an acceptable or appropriate level of uncertainty in an analysis:
should you choose $2\sigma$ or $3\sigma$, for example? This choice is likely to be driven by external
requirements, such as those of a safety case, but what a given level of uncertainty
represents in practical terms, and its consequences, can often be difficult to establish.


* When establishing an overall safety position, it might be necessary to consider uncertainty
from multiple analyses, each with its own, potentially different, assessment
method (some may be BEPU, others conservative). In this case the overall uncertainty can be
difficult to interpret and may lead to a contradictory, overly onerous or overly restrictive position
(where too many uncertainties are concatenated on analysis and safety limits).


* Doing some sensitivity analysis and uncertainty quantification is better than doing none — 
perfection is neither economic nor possible for real problems. However, as long as a model 
is capable of reliably producing acceptable agreement with experimental data, making a 
judgement with some knowledge of the uncertainty, starting with a focus on the choice of 
uncertain inputs and defining their range/distribution, invariably adds to the understanding of 
a system’s behaviour. 




Mathematical Methods for Propagating Uncertainty


Conceptual Statistical Framework — Frequentist vs Bayesian 


Statistical methods generally fall into one of two categories: frequentist statistics or Bayesian
statistics¹.


* In a frequentist approach the probability of a given event is based on a set of observed 
measurements of the event. In frequentist analysis, the parameters of a probability model 
(e.g. the mean and standard deviation of a normal distribution) have a fixed unknown value 
and are often represented by an estimated value and its confidence interval. Statistical (or 
null) hypothesis testing and p-values² are used to determine if a given event is statistically
significant given the observed data (DeGroot and Schervish, 2012). 


* In a Bayesian approach the parameters of a probability model are generally treated as uncertain
quantities themselves, represented by probability distributions (Gelman et al., 2014,
Kruschke, 2015). The parameter values are calculated by combining assumed prior knowledge
about the event, which may include subjective beliefs, with available data. During the
calculation, the ‘prior’ distribution (or hypothesis) is updated, using numerical methods, with
knowledge taken from additional data (or evidence) to define the ‘posterior’ probability
distribution that an event will occur.


For example, suppose that an uncertain input variable follows a normal distribution, with the
distribution parameters $\mu$ and $\sigma$ representing the mean and standard deviation respectively. In a
frequentist approach, the limited amount of measurement data collected defines a central estimate
and confidence interval on $\mu$ and $\sigma$. As more data is collected the size of these confidence
intervals reduces. In a Bayesian approach, $\mu$ and $\sigma$ are treated as random variables defined by
posterior probability distributions which are a result of the assumed prior distribution, updated for
knowledge regarding the measurement data. As more data is collected, the analysis and posterior
distribution are updated to reflect this.
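As a minimal sketch of the frequentist half of this example, the following Python snippet estimates the mean of a normally distributed input from increasing amounts of data and reports its 95% confidence interval. The data values, the random seed and the helper name `mean_confidence_interval` are illustrative assumptions and are not taken from any analysis in this volume; the interval uses the standard Student's t construction.

```python
import numpy as np
from scipy import stats


def mean_confidence_interval(samples, level=0.95):
    """Return (estimate, lower, upper) for the mean, using Student's t."""
    x = np.asarray(samples, dtype=float)
    n = x.size
    estimate = x.mean()
    # Standard error of the mean from the unbiased sample standard deviation
    sem = x.std(ddof=1) / np.sqrt(n)
    half_width = stats.t.ppf(0.5 + level / 2.0, df=n - 1) * sem
    return estimate, estimate - half_width, estimate + half_width


# Hypothetical measurements drawn from a normal distribution (mu = 10, sigma = 2)
rng = np.random.default_rng(0)
measurements = rng.normal(loc=10.0, scale=2.0, size=1000)

interval_widths = {}
for n in (10, 100, 1000):
    est, lower, upper = mean_confidence_interval(measurements[:n])
    interval_widths[n] = upper - lower
    print(f"N = {n:4d}: mean = {est:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

The interval narrows roughly in proportion to $1/\sqrt{N}$ as more data is included, which is the behaviour described in the example above.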


Statistical software tools are available with implementations of many of the algorithms to perform
frequentist and Bayesian statistical analysis, and either approach may be used depending on
circumstance and the preference of the user. In many simple cases for characterising an uncertain
input the results will be similar. Frequentist methods tend to be used more commonly, and Bayesian
methods can be more computationally expensive and can require more expert knowledge, but are
becoming increasingly used. Bayesian methods can also be highly effective when there is little
data available to inform decision making (for example to support expert elicitation).


1 Choosing one or the other is loosely analogous to choosing whether to use the finite element or finite volume approach for
discretising PDEs for CFD; the implementation details for any methods stem from this choice.

2 p-values are an estimate of the probability that a particular result, or a result more extreme than the observed value, could
have occurred by chance (Crawley, 2015).


In this technical volume the majority of the methods presented are frequentist in nature (specifically 
Monte-Carlo propagation of uncertainty). However, a description of how Bayesian approaches may 
be used to help characterise uncertain inputs is presented alongside a range of other methods in 
Section 4.2.4, and it is also possible to make use of some of the topics discussed within a Bayesian 
approach (Wu et al., 2018). 


Parametric Uncertainty Characterisation 


Propagating uncertainty through a calculation methodology needs each identified source of input 
uncertainty to be appropriately characterised. In this process, each uncertain input is assigned a 
mathematical structure that describes the nature of its uncertainty, and specific numerical param- 
eters for that structure defined (Roy and Oberkampf, 2011). 


Aleatory uncertain inputs are most often characterised by a continuous PDF, which represents
the relative likelihood of the uncertain parameter taking different values. For example, the
structure could be a normal distribution, and the parameters are the mean and standard
deviation. However, more complex distribution forms could be used if required, for example
an asymmetric Generalised Extreme Value distribution (defined by location, scale and shape
parameters) or the beta distribution (defined on the interval [0, 1] by two positive shape
parameters).


Epistemic uncertain inputs should be conceptually separated from aleatory uncertain inputs. 
Epistemic uncertainties are commonly represented by an interval or take a limited number 
of discrete (or categorical) values, but some forms may also be able to be represented by a 
continuous PDF. 


In practice, some uncertain inputs will be a combination of both aleatory and epistemic uncertainty 
and methods which account for both should be considered when characterising and propagat- 
ing the uncertainty. For example, when modelling graphite, the porous structure of the material 
means that the thermal conductivity can only be represented by a probability distribution, and is 
hence aleatoric in nature. However, the thermal conductivity of graphite changes due to irradiation 
damage, and making measurements of irradiated graphite specimens is difficult, so only limited 
experimental data is likely to be available. Hence, there will be epistemic uncertainty on the pa- 
rameters defining the probability distribution of the aleatory uncertainty for thermal conductivity at 
a given irradiation. 


Considerations When Characterising an Uncertain Input 


The method for characterising uncertain inputs should consider several aspects:


Where does the uncertainty originate?


The source of the input parameter uncertainty is significant in determining how it should be treated.
An uncertain input based on experiment can be defined by a well quantified PDF
fitted to the available data, or it could be derived from relevant published experimental data. For
example, in a model where a thermophysical property of a fluid is a key uncertain input, a set
of experimental measurements could be undertaken to define this value. The uncertainty on the
experimental results can be used to characterise the simulation input uncertainty. The uncertain
input may also be based on separate numerical calculations, modelling of a specific aspect of the
system, formal expert elicitation methods, or the PIRT process. A modelling choice or
input may need to be based only on the user’s (or an advisor’s) expert judgement and there may
be only limited evidence to support a credible quantitative uncertainty associated with the value.


Is the uncertainty aleatory or epistemic? 


Uncertain inputs which can be described by continuous distributions are generally easier to prop- 
agate through to an output metric uncertainty. Aleatory uncertainties are usually specified by con- 
tinuous distributions. It may also be possible to represent epistemic uncertainties by a continuous 
distribution and treat them in the same manner as aleatory uncertainties. Indeed, the distinction 
between aleatory and epistemic uncertainty may be dependent on the model. In this case the char- 
acterisation of the uncertainty as aleatory or epistemic should be a pragmatic choice, dependent 
on the purpose, to allow the uncertainty to be most easily represented by a continuous probability 
distribution. Kiureghian and Ditlevsen (2009) remark on the confusion that can occur when try- 
ing to assign an uncertainty to be aleatory or epistemic, or to determine when additional effort or 
knowledge can convert epistemic to aleatory uncertainty: 


... we came to the conclusion that these concepts only make unambiguous sense if 
they are defined within the confines of a model of analysis. In one model an addressed 
uncertainty may be aleatory, in another model it may be epistemic. So, the character- 
ization of uncertainty becomes a pragmatic choice dependent on the purpose of the 
application. 

The distinction is useful for identifying sources of uncertainty that can be reduced in 
near-term, i.e. without waiting for major advances to occur in scientific knowledge... The
distinction is also important from the viewpoint of transparency in decision-making, 
since it then becomes clear as to which reducible uncertainties have been left unre- 
duced by our decisions... [and] can be further reduced, albeit at a cost that may not 
be justifiable. 

More importantly, epistemic uncertainties may introduce dependence among random 
events, which may not be properly noted if the character of uncertainties is not correctly 
modelled.


What is the impact on the complexity and computational cost of the uncertainty propagation?


Epistemic uncertain inputs which are represented by an interval or a set of discrete choices can be 
more difficult to accommodate than aleatory uncertain inputs. For example, discrete distributions 
can lead to discontinuities in surrogate models which may make them hard to apply, requiring 
more costly full model evaluations instead. As a result, more complex analysis may be required to 
propagate the uncertainty through to a final output metric, as discussed in Section 4.8. 


Is there combined aleatory and epistemic uncertainty? 


When characterising an uncertain input both epistemic and aleatory uncertainty may need to be 
considered. Most often this is required if an uncertain input is random in nature, but there is epis- 
temic uncertainty in how this random nature behaves. This might be because of: 


Parameter uncertainty: The parameters required by the mathematical structure will themselves 
contain epistemic uncertainty, generally expressed by the mean and standard deviation of the 
coefficients. This represents the uncertainty on the average behaviour of the mathematical
structure in describing the behaviour of the uncertain input. The uncertainty in these parameters
can be characterised and propagated through by UQ. However, the epistemic uncertainty on 
the parameters may be neglected if this is significantly smaller than the aleatory uncertainty 
and if the aleatory uncertainty is important for the output metric. 


Functional form of the mathematical structure: If different functional forms can be used to de- 
fine the mathematical structure of the uncertain input (including the aleatory uncertainty) 
there may be epistemic uncertainty to consider in the UQ. The best performing functional
form should be evaluated where possible³, but alternatives considered as part of the UQ.
Uncertainty in the mathematical structure would need to be treated as an epistemic un- 
certainty. This is particularly important if model extrapolation is required. Finally, it may be 
necessary to include epistemic uncertainty on the functional form and on the parameters 
discussed above in the UQ. 


Extrapolation: If the uncertain input in the model is outside the range of the data used to deter- 
mine its parameters, then extrapolation of the mathematical structure is required. This should 
be avoided if possible, because uncertainty can increase rapidly outside of the range of avail- 
able data. It is likely that additional justifications that the mathematical structure chosen ex- 
tends to the required domain will be necessary (for example by embodying the appropriate 
physics) and that the additional uncertainties that are created by this are considered. 


Model calibration: If a model is ‘tuned’ to measurement data from the system being modelled 
via calibration to derive or adjust a parameter, this may make the uncertainty analysis more 
complex. Changes to the model inputs will inevitably lead to a change in the model cali- 
bration parameter. As a result, the parameter will likely need to be recalibrated against the 
measurement data for each model sample run. 


In cases where there are combined aleatory and epistemic uncertainties, the uncertain input may 
be characterised by a p-box (described in Section 4.2.4.12). 


3 For example, by examining the Akaike Information Criterion (AIC) (Vose, 2008), which is a relative measure of ‘goodness of
fit’ for statistical models using the same data (with a penalty for increasing the number of coefficients).


Example of Considerations for an Uncertain Input


An example of a continuous uncertain variable is a thermophysical property of a material (for example,
a thermal conductivity). In an NTH model, the mean value from a set of published experimental
measurements may be used. However, the experimental uncertainty means that the true thermal 
conductivity is within a continuous uncertain range which must be parametrised. Two potential 
scenarios which could be considered are: 


Aleatory uncertainty is not significant: This occurs if the material is homogeneous and its
manufacturing process is well controlled (such that a single value of true thermal conductivity
is valid for all the material) or the output metric of importance is only sensitive to the
mean value. In this case the uncertainty on the mean thermal conductivity is only due to
there being a limited amount of data to characterise it. Although this may strictly be an epistemic
uncertainty (it could be reduced with more measurement data), it is unlikely that further
or improved experimental measurements are possible in the short term, and the uncertainty
on the mean thermal conductivity can be characterised by a continuous PDF. In the
simplest case this might be via a normal distribution with central estimate given by the mean
of the experimental data and a standard deviation given by the standard error on the
mean.


Aleatory uncertainty is significant: This might occur if the material is non-homogeneous or there
is intrinsic variability in the component manufacturing (for example, the thermal conductivity
of graphite in a large number of moderator bricks), and the output metric of interest is sensitive
to extreme values of the aleatory uncertainty. In this case the uncertainty in the aleatory
distribution must be characterised (i.e. how uncertain is the standard deviation of the distribution
of thermal conductivity?). This is significantly more complex and requires both epistemic
(uncertainties on the parameter estimates) and aleatory (the distribution width) uncertainties
to be propagated to the output metric.


This is illustrated in Figure 4.1. In Figure 4.1a an input with aleatory uncertainty is characterised by 
a normal PDF fitted to the measurement data (the purple line). However, due to the limited amount 
of measurement data (represented by the histogram), there is uncertainty in the normal distribution 
that best describes the data, resulting in many possible normal distributions (the grey lines) which 
could also describe the underlying aleatory distribution. This represents epistemic uncertainty on 
the parameters of the normal PDF. If the output metric uncertainty is only dependent on the mean 
value of the uncertain input, then the confidence interval on the mean of the input can be used to 
characterise the uncertain input. However, if the entire aleatory uncertainty distribution is important 
for output metric uncertainty, then consideration also needs to be made of the epistemic uncertainty 
in the fitted distribution, and propagated through the UQ method. This uncertainty will likely be 
significantly larger in the distribution tails compared to the central estimate. Figure 4.1b illustrates
how taking more experimental measurements will improve knowledge of the aleatory distribution 
form and reduce the epistemic uncertainty on the normal PDF distribution parameters.


Figure 4.1: Example of an aleatory uncertain input and the impact that limited input data
has on the uncertainty of defining the mean, $\mu$, and standard deviation, $\sigma$, of a PDF of
the input. The grey lines show a set of alternative normal PDFs that could represent the
data, which result in epistemic uncertainty in the mean and $2\sigma$ of the distribution.
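The effect illustrated in Figure 4.1 can be explored numerically. The sketch below is not the method used to produce the figure; it assumes hypothetical normally distributed measurements and uses bootstrap resampling (discussed in Section 4.2.4.9) as one possible way of generating a family of alternative normal PDFs. The function name `bootstrap_normal_fits` and all numerical values are illustrative.

```python
import numpy as np


def bootstrap_normal_fits(data, n_boot=2000, seed=0):
    """Fit a normal PDF (mean, standard deviation) to bootstrap resamples of data."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    # Each row is one resample of the data, drawn with replacement
    index = rng.integers(0, data.size, size=(n_boot, data.size))
    resamples = data[index]
    return resamples.mean(axis=1), resamples.std(axis=1, ddof=1)


# Hypothetical measurement sets drawn from the same aleatory distribution
rng = np.random.default_rng(1)
datasets = {
    "20 measurements": rng.normal(100.0, 10.0, size=20),
    "500 measurements": rng.normal(100.0, 10.0, size=500),
}

spread = {}
for label, data in datasets.items():
    mu, sigma = bootstrap_normal_fits(data)
    spread[label] = {
        "mean": mu.std(),                         # epistemic spread of the mean
        "sigma": sigma.std(),                     # epistemic spread of the width
        "upper_tail": (mu + 2.0 * sigma).std(),   # epistemic spread of mean + 2 sigma
    }
    print(label, {key: round(float(value), 2) for key, value in spread[label].items()})
```

Under these assumptions, the spread of the fitted parameters shrinks as the number of measurements grows, and the spread of a tail quantity such as the mean plus $2\sigma$ is typically larger than that of the mean alone.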


Central Limit Theorem 


The Central Limit Theorem (CLT) is the most important result in statistics (Navidi, 2020) and states
that the mean, $\bar{x}$, of a large number ($N$) of independent samples from an underlying distribution,
$f(x)$, will be approximately normally distributed (Vose, 2008):


$$\bar{x} \sim \mathrm{Normal}\left(\mu, \frac{\sigma}{\sqrt{N}}\right)$$


where $\mu$ and $\sigma$ are the mean and standard deviation of the distribution $f(x)$. This holds regardless
of the functional form of the underlying distribution $f(x)$, provided that its variance is finite. The
CLT also holds when calculating the mean from a large number of samples from different
distributions, provided no one distribution dominates.


The CLT has two important consequences for a UQ assessment. Firstly, when characterising the
uncertain inputs, uncertainty on the mean value can, in most circumstances, be described by a 
normal distribution. Secondly, the output metric uncertainty is likely to be approximately normally 
distributed, unless the underlying functional response of output metric in the NTH analysis is highly 
non-linear. 
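The CLT can be checked numerically with a short experiment. The sketch below assumes an exponential underlying distribution (chosen only because it is strongly non-normal) and compares the observed scatter of many sample means with the $\sigma/\sqrt{N}$ prediction; the sample sizes and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)

N = 50               # number of samples contributing to each mean
n_repeats = 20_000   # number of independent means generated

# Exponential distribution with scale 2: mean = 2 and standard deviation = 2
mu, sigma = 2.0, 2.0
samples = rng.exponential(scale=2.0, size=(n_repeats, N))
sample_means = samples.mean(axis=1)

predicted_std = sigma / np.sqrt(N)        # CLT prediction
observed_std = sample_means.std(ddof=1)   # scatter actually seen in the means

print(f"mean of sample means: {sample_means.mean():.3f} (underlying mean {mu})")
print(f"std of sample means:  {observed_std:.3f} (CLT prediction {predicted_std:.3f})")
```

Repeating the experiment with a smaller `N` shows the distribution of the means retaining more of the skewness of the underlying distribution, which is why the theorem is stated for a large number of samples.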


Characterising Uncertain Inputs 


In general, an uncertain input can be described in a probabilistic manner (predominantly aleatory
uncertainty), by an interval (epistemic uncertainty), or by an imprecise distribution (combined
aleatory and epistemic uncertainty; Roy and Oberkampf, 2011). Several common statistical
techniques and methods to help characterise an uncertain input are discussed below.


Descriptive Statistics


The simplest method to characterise an uncertain input is to use descriptive statistical estimators.
Supposing there are $N$ independent samples of an uncertain input available (from experiment or
other analytical models), and assuming that the data is normally distributed, then the mean and
variance can be used to characterise the uncertain input. However, because there are only a limited
number of data samples there is uncertainty in the estimators, commonly known as the standard
error. This is equivalent to the statistical fluctuation (or standard deviation) of the estimator.
Table 4.1 shows expressions for how to calculate the most common estimators and their standard
error. In practice, it may be more appropriate and accurate to use numerical techniques (such
as bootstrapping, discussed in Section 4.2.4.9) to find the uncertainty on statistical estimators,
especially for the variance.


Table 4.1: Simple descriptive statistical estimators of a distribution of data and the standard
error (or uncertainty) on the value as a result of having a limited sample of data
(Ahn and Fessler, 2003). The estimate of the standard error of the mean, $\sigma_{\bar{X}}$, is applicable
to data representing any probability distribution. The estimate of the standard error
in the variance, $\sigma^2$, is applicable to data with a normal distribution; estimates can be
made for other distributions, but require the kurtosis to be estimated (Cho et al., 2005).
Note that it is often preferable to calculate the sample variance using the second variant,
because it can be achieved in practice by accumulating running sums of $X_i$ and $X_i^2$ for
a value. Using the first variant requires all $X_i$ values to be available at once, and so in
a long transient calculation, for example, it would require data from all time points to be
retained, because $\bar{X}$ is only available at the end.
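The running-sum approach described in the caption of Table 4.1 might be implemented along the following lines. The table itself is not reproduced here, so the expressions in this sketch are the standard textbook estimators and are only assumed to correspond to those in the table; the class name `RunningStats` is hypothetical.

```python
import math


class RunningStats:
    """Accumulate running sums of X_i and X_i^2 without storing the samples."""

    def __init__(self):
        self.n = 0
        self.sum_x = 0.0
        self.sum_x2 = 0.0

    def add(self, x):
        self.n += 1
        self.sum_x += x
        self.sum_x2 += x * x

    @property
    def mean(self):
        return self.sum_x / self.n

    @property
    def variance(self):
        # Unbiased sample variance written in terms of the running sums
        return (self.sum_x2 - self.n * self.mean**2) / (self.n - 1)

    @property
    def standard_error_of_mean(self):
        return math.sqrt(self.variance / self.n)

    @property
    def standard_error_of_variance(self):
        # Assumes normally distributed data
        return self.variance * math.sqrt(2.0 / (self.n - 1))


running = RunningStats()
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:   # illustrative samples
    running.add(value)

print(f"mean = {running.mean:.3f} +/- {running.standard_error_of_mean:.3f}")
print(f"variance = {running.variance:.3f} +/- {running.standard_error_of_variance:.3f}")
```

One caveat with this form is that subtracting two large, nearly equal sums can lose floating-point precision when the mean is large compared with the standard deviation; an incremental update such as Welford's algorithm is a commonly used alternative in that situation.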


Analytical Uncertainty Propagation 


Some uncertain inputs to a model may be calculated based on other measurements. For simple
expressions it is possible to analytically characterise the uncertain input derived from the other
measurements using the standard equation for error propagation. For a function $z = f(a, b, \ldots)$
the standard deviation on the output $z$, $\sigma_z$, is given by using the partial derivatives expressing
the sensitivity of the uncertain output to the input measurements:


$$\sigma_z^2 = \sum_i \left(\frac{\partial f}{\partial x_i}\right)^2 \sigma_i^2 + \sum_i \sum_{j \neq i} \frac{\partial f}{\partial x_i} \frac{\partial f}{\partial x_j} \sigma_{ij}$$


where $x_i$ denotes measurement $i$ (one of $a, b, \ldots$), $\sigma_i$ is the standard deviation of
measurement $i$ and $\sigma_{ij}$ is the covariance between measurements $i$ and $j$. Where the
measurements can be assumed to be not covariant, the ‘cross-terms’ are zero.
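A minimal numerical sketch of this first-order propagation is given below. It assumes that the partial derivatives are estimated by central finite differences and that the input uncertainties are supplied as a covariance matrix (variances on the diagonal, covariances off the diagonal). The function `propagate_uncertainty`, the heat flux example and all numerical values are hypothetical.

```python
import numpy as np


def propagate_uncertainty(f, x, covariance, rel_step=1e-6):
    """First-order standard deviation of z = f(x) for inputs with the given covariance."""
    x = np.asarray(x, dtype=float)
    covariance = np.asarray(covariance, dtype=float)
    gradient = np.empty_like(x)
    for i in range(x.size):
        # Central finite difference estimate of the partial derivative df/dx_i
        step = rel_step * max(abs(x[i]), 1.0)
        x_plus, x_minus = x.copy(), x.copy()
        x_plus[i] += step
        x_minus[i] -= step
        gradient[i] = (f(x_plus) - f(x_minus)) / (2.0 * step)
    # Diagonal entries give the variance terms; off-diagonal entries give the cross-terms
    return float(np.sqrt(gradient @ covariance @ gradient))


def heat_flux(x):
    # Hypothetical derived input: q = h * dT
    h, delta_t = x
    return h * delta_t


values = [500.0, 20.0]              # h in W/m^2/K and dT in K (illustrative values)
std_devs = np.array([25.0, 1.0])    # standard deviations of h and dT
uncorrelated = np.diag(std_devs**2)

sigma_q = propagate_uncertainty(heat_flux, values, uncorrelated)
print(f"q = {heat_flux(values):.0f} +/- {sigma_q:.0f} W/m^2")
```

Because the expression is a first-order (linearised) approximation, it is most reliable when the input uncertainties are small relative to the scale over which $f$ is non-linear.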
Regression Methods


Regression analysis can be used to find a functional relationship ($f$) between a dependent variable
($Y_i$) and one or more independent variables ($X_j$):


$$Y_i = f(X_j, \beta_j) + \epsilon_i$$


where $\beta_j$ are the unknown coefficients in the function and $\epsilon_i$ is an error term for un-modelled
features and random variable noise. The most common forms of regression modelling are:


* Ordinary (or linear) Least Squares Regression (OLS): A model that is linear in its coefficients,
where each term is a constant or a coefficient multiplied by one or more independent variables
raised to a power. Variable transforms can be performed to linearise certain functional forms.


* Non-linear Least Squares Regression: A model that is non-linear in its coefficients, for
example one in which coefficients appear inside exponential or trigonometric functions.


In regression modelling the optimal set of coefficient values is found to match the data, normally
by finding the global minimum of the sum of the squared errors (making it closely related to ‘curve
fitting’).


In NTH, regression modelling is commonly used to determine correlations between dimensionless 
quantities or dependency of physical properties on environmental conditions. For example, regres- 
sion modelling could be used to determine a functional form between a thermophysical property 
(the dependent variable) and temperature (the independent variable) of a material based on a set 
of experimental measurements. There is extensive literature on performing regression modelling 
(DeGroot and Schervish, 2012) and it is included in statistical software packages. However, care 
should be taken that the underlying regression modelling assumptions hold⁴ to ensure the model
accurately represents the observed data and that the model is not over-parametrised⁵.


4 The key assumptions of regression modelling are: 1) the residuals (differences between each datapoint and the regression
model/curve fit) should be normally distributed, meaning that large deviations have a lower probability of occurrence; 2) each
data point is independent and an equivalent/comparable measurement of the same quantity; and 3) the magnitude of the
residuals does not vary as a function of the independent variables (homoscedasticity). In practice, regression models and
methods for fitting them are generally robust to moderate deviations from these assumptions, and more complex methods
exist to account for and correct deviations (such as generalised and robust least-squares).

5 An over-parameterised model can be one in which there are too many unknown coefficients for the amount of available data,
and it is mathematically underdetermined. It can also be one where a regression function with more coefficients than are
needed is used (even where a large number of data points are available). For example, fitting a high-order polynomial
function to data that is inherently quadratic introduces coefficients that have no purpose. In both cases, an over-parameterised
regression model will attempt to describe the noise in the data as well as the underlying physical trend, which can impair
its performance. Methods are available to assess whether each coefficient is significant, and to add them progressively
(stepwise regression) so that unnecessary terms can be removed or avoided.


The uncertainty in regression modelling is primarily characterised by the epistemic uncertainty in
the fitted functional form, the epistemic uncertainty in the fitted coefficients (defined by the standard
error in them), and the aleatory uncertainty in the observations. This is illustrated in Figure 4.2.
The data is created from the (cubic) true function form, with the addition of Gaussian random noise
(representing aleatory uncertainty). The dashed lines represent OLS fits using quadratic and cubic
functional forms. The difference between the red and green lines represents epistemic uncertainty
in the functional form. Neither fit is a particularly good representation of the true function around a
value of 1 for the independent variable, because the aleatory uncertainty in the data (grey shaded
region) is (randomly) biased above the true function. The epistemic uncertainty in the cubic fit
model coefficients is represented by the purple shaded region. A better fit (reduced epistemic
uncertainty in model coefficients and closer agreement to the true function) would be achieved
by having more input data points. There is a significant increase in epistemic uncertainty in the
model fit predictions above a value of 5, where there is no data, highlighting the difficulties and
risks in extrapolating regression models beyond the range of their fitting data. The necessity to
include one or more of these uncertainties (possibly in combination) must be decided based on the
considerations outlined above.


Figure 4.2: An example of the different forms of uncertainty in a linear regression model.
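The sketch below reproduces the flavour of this example using NumPy's `polyfit`, which returns the covariance matrix of the fitted coefficients when called with `cov=True`. It is not the calculation behind Figure 4.2: the cubic 'true' function, the noise level, the number of data points and the helper names are all assumptions made for illustration.

```python
import numpy as np


def true_function(x):
    # Hypothetical cubic 'truth', used only to generate example data
    return 0.2 * x**3 - 1.0 * x**2 + 1.5 * x + 2.0


rng = np.random.default_rng(3)
x_data = np.linspace(0.0, 5.0, 15)
y_data = true_function(x_data) + rng.normal(0.0, 0.5, size=x_data.size)


def fit_polynomial(x, y, degree):
    """OLS polynomial fit: coefficients (highest power first) and their covariance."""
    coefficients, covariance = np.polyfit(x, y, deg=degree, cov=True)
    return coefficients, covariance


def prediction_std(x_new, covariance):
    """Standard deviation of the fitted curve at x_new due to coefficient uncertainty."""
    x_new = np.atleast_1d(np.asarray(x_new, dtype=float))
    # Columns are x^degree ... x^0, matching the coefficient order of np.polyfit
    jacobian = np.vander(x_new, covariance.shape[0])
    return np.sqrt(np.einsum("ij,jk,ik->i", jacobian, covariance, jacobian))


quadratic, quadratic_cov = fit_polynomial(x_data, y_data, degree=2)
cubic, cubic_cov = fit_polynomial(x_data, y_data, degree=3)

x_check = np.array([2.5, 7.0])   # inside and outside the range of the data
form_difference = np.polyval(cubic, x_check) - np.polyval(quadratic, x_check)
print("cubic coefficient standard errors:", np.sqrt(np.diag(cubic_cov)))
print("difference between functional forms:", form_difference)
print("cubic fit standard deviation:", prediction_std(x_check, cubic_cov))
```

The difference between the two fits gives an indication of the functional form uncertainty, while `prediction_std` reflects the coefficient uncertainty; under these assumptions the latter grows rapidly once the fit is evaluated beyond the range of the data.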


Maximum Likelihood Estimation 


Maximum Likelihood Estimation (MLE) is a method for choosing the set of parameters for an underlying
statistical model by calculating the set of coefficients under which the observed data
is most likely to occur. It is based on evaluating the likelihood function for data $x_i$, a distribution
function $f$ and parameters $\theta$:


$$L(\theta) = \prod_i f(x_i, \theta)$$


This likelihood function represents the probability of observing all of the data with parameters $\theta$
(DeGroot and Schervish, 2012). The maximum likelihood estimator (i.e. the set of coefficients $\theta$
most likely to describe the underlying data for $f$) can be found by attempting to maximise the
likelihood function. If it can be assumed that the data are independent samples from the same
distribution (as the product form above requires), it is often more useful to work with the
log-likelihood function for numerical reasons, because the product becomes a sum.


The benefit of MLE is that it can provide a consistent approach in parameter estimation across a
number of different distributions, and comparing the AIC between different models can give a good
indication of the best distribution form. It should be noted that at small sample sizes MLE can lead
to bias in the estimates.


As for regression modelling, there is extensive literature on performing MLE and it is included in 
statistical software packages. Furthermore: 


* The parameter uncertainties found from MLE must be considered and propagated through
the UQ if necessary.


* The form of the assumed distribution, $f$, also introduces uncertainty, particularly if the output
metric is sensitive to the tails of the distribution, which are poorly defined by the available data.
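As an illustration of MLE and of comparing candidate distribution forms using the AIC, the sketch below fits a normal and a gamma distribution to the same data using the `fit` method of SciPy's continuous distributions. The data are synthetic, the choice of candidate distributions is an assumption, and `fit_and_aic` is a hypothetical helper rather than a library function.

```python
import numpy as np
from scipy import stats

# Hypothetical skewed, strictly positive data
rng = np.random.default_rng(7)
data = rng.gamma(shape=2.0, scale=1.5, size=200)


def fit_and_aic(dist, data, **fixed):
    """Maximum likelihood fit of a SciPy distribution, returning (parameters, AIC).

    Keyword arguments are assumed to be fixed parameters only (for example floc=0.0).
    """
    params = dist.fit(data, **fixed)
    log_likelihood = np.sum(dist.logpdf(data, *params))
    n_free = len(params) - len(fixed)   # fixed parameters are not counted
    return params, 2.0 * n_free - 2.0 * log_likelihood


normal_params, normal_aic = fit_and_aic(stats.norm, data)
gamma_params, gamma_aic = fit_and_aic(stats.gamma, data, floc=0.0)

print(f"normal: params = {np.round(normal_params, 3)}, AIC = {normal_aic:.1f}")
print(f"gamma:  params = {np.round(gamma_params, 3)}, AIC = {gamma_aic:.1f}")
```

A lower AIC indicates the preferred form for the same data; it is a relative measure only and does not by itself show that either candidate describes the tails of the distribution well.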


Kernel Density Estimation 


Kernel Density Estimation (KDE) is a non-parametric representation⁶ of the PDF of a distribution
of data. It is useful if the distribution is not easy to model via a parametric distribution (e.g. if it is
bimodal) or if it is desirable to not make specific assumptions about the form of the data. For data
$x_i$ ($i = 1, \ldots, N$), the PDF of a KDE is calculated by


$$\hat{f}(x) = \frac{1}{N} \sum_{i=1}^{N} K(x - x_i, h)$$


where $K(x, h)$ is the Kernel, which is a positive function, normalised to unit area, with a bandwidth $h$. Common forms of
K are Gaussian, exponential and tophat. The results of a KDE can be sensitive to the functional 
form of K and, more importantly, the bandwidth. The bandwidth controls the level of smoothing 
applied by the Kernel function. Higher bandwidths smooth the data more, but can lead to important 
features in the data being removed. However, many analytical software packages (such as those 
described in Section 2.2.2) have automated methods for choosing appropriate defaults. Once the 
KDE has been calculated it can be evaluated at any value of x to propagate through the uncertainty 
analysis. Uncertainty in the KDE could be calculated via bootstrap methods. 
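The sensitivity of a KDE to its bandwidth can be seen in a small experiment. The sketch below uses `scipy.stats.gaussian_kde` (a Gaussian kernel only) on hypothetical bimodal data, once with SciPy's default bandwidth rule and once with a deliberately large bandwidth factor; the data and the factor of 2.0 are arbitrary choices for illustration.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Hypothetical bimodal data that a single normal distribution would describe poorly
rng = np.random.default_rng(11)
data = np.concatenate([rng.normal(-2.0, 0.5, 150), rng.normal(2.0, 0.5, 150)])

kde_default = gaussian_kde(data)                 # bandwidth from Scott's rule (SciPy default)
kde_smooth = gaussian_kde(data, bw_method=2.0)   # deliberately over-smoothed

for label, kde in (("default", kde_default), ("over-smoothed", kde_smooth)):
    valley, peak = kde([0.0, 2.0])
    print(f"{label:>13}: pdf(0) = {valley:.3f}, pdf(2) = {peak:.3f}")

# Draw new input samples from the KDE for use in an uncertainty propagation
input_samples = kde_default.resample(1000, seed=0)[0]
print("resampled mean and std:", input_samples.mean(), input_samples.std())
```

With the default bandwidth the estimate keeps the dip between the two modes, whereas the over-smoothed estimate merges them into a single broad peak, which is the loss of features described above.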


Empirical Cumulative Distribution Function 


A CDF can be directly calculated from a set of independent measurements of a quantity; this is
known as the empirical CDF (eCDF). Uncertainty on the eCDF can also be found using methods
such as Greenwood’s formula and Kolmogorov-Smirnov bounds. Such methods are commonly 
used in survival/failure analysis (Lee and Wang, 2003) and can be used when there is a signifi- 
cant amount of available data or to better understand or analyse data prior to fitting a statistical 
distribution. 
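A minimal sketch of an eCDF with an uncertainty band is given below. Note that it uses the Dvoretzky-Kiefer-Wolfowitz (DKW) inequality to form a distribution-free simultaneous band, which is a related but different construction from the Greenwood and Kolmogorov-Smirnov methods named above; it is chosen here only because it has a simple closed form. The data and function name are hypothetical.

```python
import numpy as np


def ecdf_with_band(samples, alpha=0.05):
    """Return sorted data, eCDF values and a (1 - alpha) simultaneous confidence band."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = x.size
    ecdf = np.arange(1, n + 1) / n
    # Band half-width from the Dvoretzky-Kiefer-Wolfowitz inequality
    epsilon = np.sqrt(np.log(2.0 / alpha) / (2.0 * n))
    lower = np.clip(ecdf - epsilon, 0.0, 1.0)
    upper = np.clip(ecdf + epsilon, 0.0, 1.0)
    return x, ecdf, lower, upper


rng = np.random.default_rng(2)
measurements = rng.normal(300.0, 15.0, size=100)   # hypothetical measurements
x, ecdf, lower, upper = ecdf_with_band(measurements)
print(f"band half-width for N = {x.size}: {np.max(upper - ecdf):.3f}")
```

The half-width scales with $1/\sqrt{N}$, so a band of this kind is only narrow when a significant amount of data is available.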


Bayesian Inference 


The Bayesian inference process for characterising uncertain inputs is defined by two key concepts
(Kruschke, 2015, Gelman et al., 2014): 


1. Inference is the process of apportioning credibility (or likelihood) across a number of
possibilities.


2. The possibilities to which the credibility is assigned can be described by parameters in a
mathematical model.


6 A non-parametric method is one which is not restricted to a defined distribution form. Instead, the method is a numerical
description based on the underlying data.


The process for Bayesian inference is: 


* Define a probability model of the system. This is a combined (joint) probability model including
uncertainties in measured quantities, uncertainties in model predicted output metrics, and
in the parameters defining the uncertain input. This is represented by appropriate prior
distributions: user-chosen mathematical models with initial assessments of their credibility. This
aspect of the Bayesian modelling process is often the hardest. It requires subjective prior
distributions to be assigned, each with a subjective credibility, and the outcome of the Bayesian
inference may depend on these choices.


* Inference (or updating) of the model to account for the known observations (the available 
data being used to construct the uncertain input). This involves numerical calculations (us- 
ing Bayes’ rule) to refine or update the credibilities to make them most consistent with the 
observed data. 


The output of the Bayesian inference is a set of posterior probability distributions which describe the
likelihood of occurrence of each possibility. These probability distributions can then be directly used
to characterise an uncertain input for a UQ propagation. The outcome of the Bayesian inference
should be assessed to check that it is consistent with the observed data; if it is not, then
a different mathematical model may be required.
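
The two steps above (define a prior probability model, then update it with observations using Bayes' rule) can be sketched with a simple grid approximation. The example assumes normally distributed measurements with a known standard deviation and a normal prior on the unknown mean; the observations, prior parameters and grid limits are all hypothetical.

```python
import math


def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


observations = [10.2, 9.7, 10.5, 10.1, 9.9]  # hypothetical measurements
sigma_meas = 0.5                             # assumed known measurement std
prior_mu, prior_sd = 8.0, 2.0                # assumed (subjective) prior on the mean

# Candidate values ("possibilities") for the uncertain mean, from 6 to 14
grid = [6.0 + 0.005 * i for i in range(1601)]

# Bayes' rule: posterior is proportional to prior times likelihood
unnormalised = []
for mu in grid:
    likelihood = 1.0
    for y in observations:
        likelihood *= normal_pdf(y, mu, sigma_meas)
    unnormalised.append(normal_pdf(mu, prior_mu, prior_sd) * likelihood)

total = sum(unnormalised)
posterior = [u / total for u in unnormalised]  # probability mass per grid point

post_mean = sum(m * p for m, p in zip(grid, posterior))
post_sd = math.sqrt(sum((m - post_mean) ** 2 * p for m, p in zip(grid, posterior)))
print(f"posterior mean = {post_mean:.3f}, posterior std = {post_sd:.3f}")
```

For this conjugate case the result can be checked against the analytic posterior; for realistic models a sampling method would normally replace the grid.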


Expert Elicitation 


Expert elicitation methods can be used where there is a lack of alternative data with which to
characterise an uncertain input, for example where experimental measurements are not possible
or are so expensive or time consuming that they are impractical. The required output from the expert
elicitation process can be the same as for the other methods discussed: a mathematical
characterisation of the uncertain input (Morgan, 2014).


The process for expert elicitation is:


* Define the uncertain input to characterise. The nature of the input to characterise will influ- 
ence the expert elicitation process. For some inputs there may be a very limited number of 
experts to consult. For others, the spectrum of available experts may be much broader. 


* Select appropriate experts with knowledge of the subject. It should be noted that the number
of experts in a particular field may be limited, or the experts may be close collaborators, so the
independence of different experts should be considered.


* Question the experts. These questions should be framed with the purpose of finding sub- 
jective probabilistic assessments of the uncertain input. This may be difficult, especially in 
circumstances where experts are unfamiliar or uncomfortable with making subjective proba- 
bility assessments. The use of qualitative words should be avoided where possible, because 
their interpretation may differ substantially between experts and contexts⁷.


7 In a number of contexts that require consensus on meanings and consistent application of terms, such as intelligence and
climate science (IPCC, 2010), the concept of ‘words of estimative probability’ is used, where specific terms indicate a band
of numerical probability.
* Formulate probability distribution. The answers to the questions are used to form a subjective
probability distribution of the uncertain input for the individual experts. Iteration with the pre- 
vious step is possible, and initial responses of the experts may need further refinements to 
elicit their best subjective judgement. It is likely that the output probability distributions will be 
generated based on the experts’ indirect knowledge and point estimates, rather than a direct 
knowledge of the distribution form and parameters. This can be captured by using three- 
point approximations (Keefer and Bodily, 1983) or bracket median methods (Smith, 1993) 
for example. A number of tools (for example Morris et al., 2014) are available to support this 
activity. 


* Combine results. The results of the experts are combined to form a final subjective probability
distribution which defines the uncertain input. 


Expert elicitation is particularly prone to contextual bias, where the experts are unintentionally
influenced by background information. Furthermore, there is evidence that expert judgements tend to
be over-confident (i.e. the outcome is probability distributions which are too narrow, see Hemming
et al., 2018). As a result, expert elicitation should generally be performed within a well-defined
framework or protocol to try to mitigate such bias. Examples of such protocols include Delphi
(Brown, 1967), Cooke (Cooke, 1991), IDEA (Hemming et al., 2018) and SHELF (Gosling, 2018).
One of the main differences between protocols is the level of interaction between experts. For
example, in the Cooke protocol the subjective probability distributions of different experts are
mathematically aggregated with no interaction between them. Alternatively, some protocols (such as
SHELF and IDEA) employ behavioural aggregation where the output probability distributions are a
result of independent assessment and facilitated discussions between experts. Mathematical
aggregation protocols may achieve faster results, as there is no need for a consensus to be reached
and facilitated discussions are not required, but often require some form of subjective ranking to
be applied (via a mathematical formula) to produce the combined distribution.
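
A minimal sketch of mathematical aggregation is given below. It assumes each expert supplies a lower bound, most likely value and upper bound, represents each judgement as a triangular distribution, and combines them with an equally weighted linear pool. These choices, the expert values and the function name are illustrative assumptions and do not reproduce the weighting scheme of any named protocol.

```python
import random

# (lower, most likely, upper) judgements from three hypothetical experts
experts = [(0.8, 1.0, 1.3), (0.9, 1.1, 1.6), (0.7, 1.0, 1.2)]
weights = [1.0, 1.0, 1.0]  # equal weighting is an assumption of this sketch


def sample_linear_pool(experts, weights, n, rng):
    """Sample the weighted mixture of the experts' triangular distributions."""
    samples = []
    for _ in range(n):
        # Pick an expert in proportion to their weight, then sample their distribution
        low, mode, high = rng.choices(experts, weights=weights, k=1)[0]
        samples.append(rng.triangular(low, high, mode))
    return samples


rng = random.Random(1)
pooled = sorted(sample_linear_pool(experts, weights, 10000, rng))

# Percentiles of the combined subjective distribution
p05, p50, p95 = (pooled[int(q * len(pooled))] for q in (0.05, 0.5, 0.95))
print(f"5%: {p05:.3f}, 50%: {p50:.3f}, 95%: {p95:.3f}")
```

Changing `weights` is where a subjective ranking of the experts would enter the combined distribution.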


The rigour and depth with which expert elicitation is adopted depends on the uncertain input in
question and the resources available. Fully completing the protocols outlined above may be highly
onerous, and it may be appropriate to adopt a more pragmatic approach to the expert elicitation
process. For example, in its simplest guise, expert elicitation may involve consulting a single expert
to set upper and lower bounds on an epistemic uncertainty which does not have a dominant impact
on the output metric.


Expert elicitation can also be used in combination with Bayesian updating methods, where it can 
form the basis of the assumed prior distribution to the analysis. 


Bootstrapping and Jackknifing 


Bootstrapping and jackknifing (Efron and Gong, 1983, Efron and Tibshirani, 1993) are numerical
methods to estimate the uncertainty for fitted parameters and distribution coefficients by resampling
from a limited underlying dataset and recalculating the required metric. Suppose that there is a
dataset $X = \{x_1, x_2, \ldots, x_n\}$, where each $x_i$ is a set of independent observations of the same
quantities from the underlying distributions, with a true quantity of interest $F$.


In bootstrapping a large number, $N$, of new resampled datasets $\{X_1^*, X_2^*, \ldots, X_N^*\}$ are created
by sampling with replacement from the original dataset. Each resampled dataset is the same size as
the original, so it will contain repeat values of the same $x_i$ and some $x_i$ will be missing, but each
is considered to be representative of what might have been observed if a different set of
observations had been taken. For each of the $N$ resampled bootstrap datasets the quantity of interest
is calculated (such as the mean, the variance, or a parameter estimate via regression modelling).
The result is a distribution of the quantity of interest $\{F_1^*, F_2^*, \ldots, F_N^*\}$, which can be used
to define the uncertainty on the input. The bootstrapping technique is powerful because it can be
applied easily to both simple and complex quantities of interest. Because bootstrapping is limited
to the set of input measurement data, it cannot represent uncertainty due to systematic observation
uncertainties, or due to a subset of important outliers which have not been observed.


In jackknifing, the quantity of interest is estimated $n$ times, each time excluding a different
data point $x_i$ from the calculation. This is a systematic resampling, compared to the random resampling
of bootstrapping. Where the bootstrap gives an indication of the uncertainty on the quantity of
interest, jackknifing can give an estimate of the bias (i.e. whether the dataset is likely to over or
under-estimate the parameter). If the jackknifing shows a likely bias in the sample statistic, it may
be necessary to take this into account when characterising the uncertain input.
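
Both resampling schemes can be sketched in a few lines. The example below bootstraps the standard error of a mean and uses the jackknife to estimate the bias of the population-variance formula (which divides by n and is therefore a biased estimator when applied to a sample). The dataset, function names and number of resamples are assumptions made for illustration.

```python
import random
import statistics


def bootstrap(data, statistic, n_resamples, rng):
    """Evaluate `statistic` on datasets resampled with replacement."""
    n = len(data)
    return [
        statistic([rng.choice(data) for _ in range(n)])
        for _ in range(n_resamples)
    ]


def jackknife_bias(data, statistic):
    """Estimate the bias of `statistic` from leave-one-out recalculations."""
    n = len(data)
    full = statistic(data)
    leave_one_out = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    return (n - 1) * (sum(leave_one_out) / n - full)


rng = random.Random(0)
data = [rng.gauss(10.0, 2.0) for _ in range(30)]  # hypothetical observations

# Bootstrap: distribution of the mean, summarised by its standard deviation
boot_means = bootstrap(data, statistics.mean, 2000, rng)
se_boot = statistics.stdev(boot_means)

# Jackknife: bias of the divide-by-n variance estimator
bias = jackknife_bias(data, statistics.pvariance)
corrected = statistics.pvariance(data) - bias

print(f"bootstrap standard error of the mean: {se_boot:.3f}")
print(f"jackknife bias estimate: {bias:.4f}, corrected variance: {corrected:.4f}")
```

In this particular case the jackknife-corrected value coincides with the usual divide-by-(n-1) sample variance, which gives a convenient check on the implementation.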


Interval Range 


For purely epistemic uncertainties the uncertainty can be characterised by an interval range
(Ferson et al., 2007). Interval analysis defines an upper and lower bound for a quantity when
little else is known about it. Mathematically, a quantity $X$ that can take any value
within the interval is defined as


$$X = \{x : a \le x \le b\} = [a, b]$$


The interval may be based on quantitative information that defines the bound (for example using 
the statistical techniques outlined in this section to derive the uncertainty on the mean of a normal 
distribution) or on a subjective assessment of the bounds (e.g. from an expert elicitation process). 
The generation of further information (e.g. more measurements) could be used to reduce the size 
of the interval. Depending on the importance of the input uncertainty and amount of information 
available in defining the interval, it may be appropriate to apply a discrete (and justified) safety 
factor to the interval to ensure it covers the underlying uncertainty⁸.


An extension of interval range analysis is the representation of an uncertain input by fuzzy logic or
fuzzy numbers, in which the level of possibility of a value is constructed via a set of nested intervals
to produce a fuzzy set (Ross, 2010). The membership assigned to a given value of a parameter may take
any real value between 0 and 1 inclusive and represents the possibility that the uncertain input lies
within a fuzzy set (or the degree of uncertainty that the value belongs within the set).


8 Rigorous algorithms exist which take interval-valued inputs and provide bounds on the accumulated rounding errors,
approximation errors, and propagated uncertainties (Moore et al., 2009), but these need to be embedded throughout the
calculation route and are not covered in more detail here.
Discrete or Categorical Probabilities


As noted in Section 2.1.2, for some epistemic uncertainties the NTH model may be represented 
by a set of discrete uncertain possibilities, such as the choice of turbulence model. This is rep- 
resentative of uncertainty in the underlying model form and can be modelled by a set of discrete 
input options, each with an assigned probability. The overall probability of these options should 
sum to one. As such, these types of uncertainty can be modelled by a categorical Probability Mass 
Function (PMF). 
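
A categorical PMF over model-form options can be sampled directly, as in the sketch below. The option names and probabilities are purely hypothetical and would in practice be assigned quantitatively or subjectively.

```python
import random
from collections import Counter

# Hypothetical discrete options for an uncertain model-form choice
pmf = {"k-epsilon": 0.5, "k-omega SST": 0.3, "Reynolds stress": 0.2}

# The probabilities of the options should sum to one
assert abs(sum(pmf.values()) - 1.0) < 1e-12

rng = random.Random(2)
# Each draw selects the option to use for one realisation of the UQ study
draws = rng.choices(list(pmf), weights=list(pmf.values()), k=1000)
counts = Counter(draws)

for option, probability in pmf.items():
    print(f"{option}: assigned {probability:.2f}, sampled {counts[option] / 1000:.3f}")
```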


The probabilities of the different options may be quantitative or subjective and it should be noted 
that the discrete options may not fully represent the whole uncertainty in the model formulation. 
However, characterisation of key modelling choices via a set of discrete options may allow im- 
portant aspects of model formulation to be propagated through the UQ and incorporated in the 
reported uncertainty on the output metric. 


Probability Bounds Analysis 


The characterisation of an input may need to be the combination of aleatory uncertainty (treated 
by probabilistic methods) and epistemic uncertainties (represented by discrete values or intervals). 
As discussed above, the parameters describing a probability distribution of an uncertain input are 
themselves uncertain (due to a limited amount of input data), and this needs to be considered in 
the overall UQ (Roy and Oberkampf, 2011). 


A combination of probability theory and interval arithmetic can give rise to probability boxes
(p-boxes, Tucker and Ferson, 2006), which are similar to CDFs but have a finite width representing
the epistemic uncertainty (this is also known as imprecise probability). A p-box comprises two
non-intersecting CDFs, the upper and lower bound CDFs, which contain the family
of all possible probability distributions. This is illustrated in Figure 4.3, which shows how a set
of CDFs can be combined to give an overall uncertainty. A single CDF is the realisation of one
possible aleatory uncertainty distribution and the set of CDFs incorporates the impact of interval or
discrete epistemic uncertainties. The width of the resulting p-box (Figure 4.4) represents the range
of parameter values that are possible for a given cumulative probability level, whereas the height of
the p-box represents the range of interval-valued cumulative probabilities associated with a given
parameter value. In the limit that there is little epistemic uncertainty, the p-box degenerates to
a precise CDF representing just the aleatory uncertainty.


This is known as Probability Bounds Analysis (PBA) or Interval-Valued Probability (IVP), and it 
allows for a comprehensive description of both forms of uncertainty for an uncertain input. It is 
useful whenever the uncertainty about the distributions can be characterized by interval bounds 
about their cumulative distribution functions (Tucker and Ferson, 2006). The p-boxes represent 
the uncertainty about imperfectly known input distributions and these can be propagated through 
the model to identify a bounding p-box which encapsulates the entire uncertainty on the output 
metric. This approach has been applied to structural analysis (Choudhary et al., 2016) and is the
suggested starting point for combined uncertainties. 


[Figure 4.3 schematic: discrete or interval uncertain inputs and uncertain variables (described by
PDFs) are propagated through the model. Each realisation produces a CDF of the output metric, and
the realisations are used to provide a combined uncertainty.]


Figure 4.3: Example schematic showing propagation of probabilistic and discrete/interval
uncertainties through a model. Each CDF on the left represents the uncertainty on an output
metric for a given realisation of the epistemic/discrete uncertain inputs. These different
realisations are combined to give the overall uncertainty on the output.


For example, a p-box may be used to represent an uncertain input which is defined by a distribution
fitted to measured data via MLE. The MLE can return an uncertainty range for the mean and standard
deviation of the input and, using this information, a p-box can be calculated which envelops
all the normal distributions within these intervals. Likewise, if the distribution form is uncertain, a
p-box can be estimated from simple statistics of the data sample such as the mean, minimum value
and maximum value. A p-box may also be calculated from an empirical CDF, for example using
Kolmogorov-Smirnov bounds. Further details on how to compute p-boxes can be found in Tucker
and Ferson (2003).
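
The normal-distribution case can be sketched as follows. For a fixed value of x the normal CDF is monotonic in the mean and, on either side of the mean, monotonic in the standard deviation, so the bounding CDF values occur at the corners of the (mean, standard deviation) box. The interval values and function names below are hypothetical.

```python
import math
from itertools import product


def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def normal_pbox(x, mu_interval, sigma_interval):
    """Lower and upper bounding CDF values at x for all normals in the intervals."""
    corner_values = [
        normal_cdf(x, mu, sigma)
        for mu, sigma in product(mu_interval, sigma_interval)
    ]
    return min(corner_values), max(corner_values)


mu_interval = (9.5, 10.5)    # hypothetical uncertainty range on the mean
sigma_interval = (0.8, 1.2)  # hypothetical uncertainty range on the std

for x in (8.0, 9.0, 10.0, 11.0, 12.0):
    lower, upper = normal_pbox(x, mu_interval, sigma_interval)
    print(f"x = {x:5.1f}: CDF in [{lower:.3f}, {upper:.3f}]")
```

The gap between the two bounds at each x is the height of the p-box; it closes as the intervals on the mean and standard deviation shrink.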


Dempster-Shafer Theory 


Dempster-Shafer Theory (DST) is an alternative to the PBA method for analysing epistemic uncer- 
tainties. DST is a mathematical theory of evidence which allows one to base degrees of belief for 
one question on probabilities for a related question (Shafer, 1976). 


In DST a belief function is formulated and associated with a set of possible values which define the
degree of belief that the value set will occur⁹. The belief in a value set may be a combination from
multiple independent sources of evidence, formulated using a combination rule. A belief function
is defined for all possible sets of values and the input uncertainties, as defined by the DST belief
functions, can then be used to propagate epistemic uncertainties through a UQ calculation using
direct calculation or surrogate models. DST is not typically used in the field of nuclear energy, but
Darby (2018) demonstrates examples of it in the context of epistemic uncertainty in safety and
security issues for nuclear weapons.


9 The degree of belief in a set is different from a traditional probabilistic assignment. For example, a belief in an outcome of
zero does not imply it is impossible, as it would for a traditional probability, only that there is no reason to believe it did occur.


Figure 4.4: Example of two p-boxes using the same CDFs as Figure 2.3, where the 15
normal distributions are each sampled 100 times.


Correlation Between Inputs 


It is possible that there is some correlation between uncertain inputs, and such correlations should 
be propagated through the UQ. For example, if a fluid composition is uncertain, this will affect 
its viscosity, thermal conductivity etc, so varying the composition and deriving the consequent 
common-cause variation in thermophysical properties (if this is known) is the most appropriate 
action, rather than allowing the properties to vary independently. 


Ideally, these uncertain inputs into the NTH model would be expressed as joint PDFs which
describe how sampling from one uncertain input influences others. A covariance matrix can be used
to express the linear relationship between the variables. The correlation between two variables is
calculated by normalising their covariance, dividing it by the product of their standard deviations.
This results in a dimensionless quantity between -1 and 1 which can be used to compare variables
which are expressed in different units.
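
The normalisation can be sketched for two properties that share a common cause. The linear property models, their coefficients and the noise levels below are hypothetical and chosen only to produce correlated samples in very different units.

```python
import math
import random


def covariance(xs, ys):
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (len(xs) - 1)


def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


rng = random.Random(4)
composition = [rng.gauss(0.10, 0.01) for _ in range(500)]  # uncertain composition

# Hypothetical linear property models with a little independent scatter
viscosity = [1.0e-3 * (1.0 + 2.0 * c) + rng.gauss(0.0, 1.0e-6) for c in composition]
conductivity = [0.6 * (1.0 - 0.5 * c) + rng.gauss(0.0, 1.0e-4) for c in composition]

r = correlation(viscosity, conductivity)
print(f"covariance  = {covariance(viscosity, conductivity):.3e} (unit dependent)")
print(f"correlation = {r:.3f} (dimensionless)")
```

Rescaling either property (for example, changing its units) changes the covariance but leaves the correlation unchanged.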


When examining correlation in UQ it may also be necessary to consider: 


* Correlations that exist between the uncertain input and a feature of the output metric. For
example, when measuring the thermal conductivity of a material as a function of temperature,
the error in the input (the average temperature of the sample) may be correlated with the
error in the temperature difference across the sample that is used to estimate the output
(conductivity).


* Whether it is the uncertainties on inputs which are correlated, rather than the values of the
inputs themselves. 


In practice, it is likely that there is insufficient information to adequately define the probability distri- 
butions for correlated inputs (Roy and Oberkampf, 2011) and it may be necessary to treat them as 
independent variables in a practical engineering implementation. 


Furthermore, correlations between uncertain inputs can make propagation of them through the 
model more complex. For example, standard LHS in experimental design (discussed below) can 
be difficult when considering correlations between inputs, although specific methods have been 
developed to account for this (Sallaberry et al., 2008). It is easiest to include correlations between
uncertain inputs in simple Monte-Carlo methods of uncertainty propagation. 


Sampling or Design of Computer Experiments 


Computational experimental design provides the framework to statistically sample uncertain in- 
put parametrisations to capture the model response to them. Each statistical sample represents 
a single computer ‘experiment’ (simulation) to be performed. Effective experimental design opti- 
mises the number of model simulations that need to be performed (to keep the computational 
effort achievable) whilst ensuring that sufficient information is extracted to understand the effect of 
uncertainty in the input parameters on the overall model uncertainty. 


An experimental design needs to consider: 


* The dimensionality of the parameter space. For uncertainty or sensitivity analysis, this
involves identifying the parameters that will be varied.


* The range and shape of parameter space to cover in each dimension (the interval or PDF for 
each parameter). 


* Correlations between parameters. 


* The total number of computer experiments which can be performed. This is linked to the
cost of each simulation in terms of time and resources (simulation run-times, number of 
processors, memory and storage needed). 


If the cost associated with individual evaluations of a model is so large that an insufficient number 
of calculations can be performed, then it indicates that using a surrogate model approach (Sec- 
tion 4.4) may be warranted, as long as the number of full-order simulations needed to construct the 
surrogate is relatively low. Experimental design can also be used to determine the computer model 
simulations required to adequately define the surrogate model. It is also the case that for costly 
models, sensitivity analysis cannot necessarily afford to be performed on a separate, specifically 
chosen set of model simulations, but must make use of the same set that are performed for the 
purposes of uncertainty quantification. 
Much of the available experimental design literature is aimed at the design of physical experiments
(as opposed to computer simulations). Whilst the two have much in common, computer simulations 
generally have the advantage that there is no notion of random error, i.e. if a computer simulation 
is run twice, it will (generally) produce the same result both times, unless there is a deliberate and 
controlled use of random numbers. Therefore, it is not necessary to perform repeat calculations 
with exactly the same input data. 


Some of the experimental design techniques described in this section are most often applied 
to physical experiments (e.g. factorial design, central composite design or Box-Behnken design) 
whereas others are more commonly applied to computer simulations (e.g. random sampling design 
or stratified sampling design). In some contexts, the description Design and Analysis of Computer 
Experiments (DACE) is used to distinguish computer experiments from physical DoE (Adams et al., 
2020, Chapter 4). 


Methods for performing experimental design can be broadly categorised as those based on selecting
from a structured grid or array of locations in the sampled space, and those that have a less
structured spatial relationship. In each case, k denotes the number of parameters being
sampled, and therefore the number of dimensions of the sampled space. NIST (2021, Section 5.3)
provides details of these methods, both in the context of experimental design for physical
experiments and computer simulations, and examples of the methods discussed are shown in Figures 4.5
and 4.6.


Structured Grid Based Methods 


Factorial 


In factorial design, each parameter (or ‘factor’) is allowed to take the value of a specific level. For
example, in a two-level factorial design each parameter might take the maximum and minimum
of its uncertainty interval. If all combinations of input parameters are accounted for (‘full factorial
design’), then the experiment would need to be run $2^k$ times. Setting up the experiment in this way
generates an orthogonal array, eliminating correlation between the estimates of the main effects
and interactions. Whilst conceptually simple, this method of sampling becomes prohibitively costly
if the uncertainty in a large number of factors is being investigated.
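
A two-level full factorial design can be generated as the Cartesian product of the interval end points, as sketched below. The parameter names and interval values are hypothetical.

```python
from itertools import combinations, product

# Hypothetical uncertainty intervals (minimum, maximum) for k = 3 parameters
intervals = {
    "inlet_temperature": (280.0, 300.0),
    "mass_flow": (0.9, 1.1),
    "heat_flux": (95.0, 105.0),
}
k = len(intervals)

# Every combination of minimum/maximum values: 2^k runs
design = [dict(zip(intervals, levels)) for levels in product(*intervals.values())]
print(f"{len(design)} runs for k = {k} parameters")

# In coded units (-1 = minimum, +1 = maximum) the columns are mutually orthogonal
coded = list(product((-1, 1), repeat=k))
for i, j in combinations(range(k), 2):
    assert sum(run[i] * run[j] for run in coded) == 0

# The number of runs doubles with each added parameter
for n_params in (5, 10, 20):
    print(f"k = {n_params}: {2 ** n_params} runs")
```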


One solution to the problem of the requirement for a large number of experimental runs is to 
perform a fractional factorial design, carefully choosing only a fraction of the runs specified by the 
full factorial design (for example 25% or 50% of the total runs). This is done by only choosing the
main effects and ignoring some interactions between parameters. NIST (2021) and Andres (2002) 
provide in-depth discussion of factorial design methods. 


All two-level fractional factorial designs are based on using only the maximum and minimum of each 
parameter’s uncertainty interval, and not any values between them. This means that a surrogate 
model fit to these samples may not adequately capture the effects of variation within these intervals. 
Central Composite


A Box-Wilson Central Composite Design (more commonly referred to as a central composite de- 
sign) consists of the following parts: 


1. A complete or fractional $2^k$ factorial design (the factorial portion).
2. A central point located at the median of each parameter uncertainty interval.


3. A set of ‘star’ points that are the same as the central point, except for one factor that is varied 
between values above and below the median (typically outside of the two factorial levels in 
the factorial portion), so they lie on the co-ordinate axes of the parameter space. There are 
twice as many star points as there are parameters in the design (2k). 


Therefore, the total number of design points in a central composite design with a complete factorial
portion is $n = 2^k + 2k + 1$. These additional points (compared to the two-level full factorial design)
allow for the estimation of a second-order polynomial equation, whereas the two-level factorial
design is limited to linear relationships between variables. More details are provided by NIST (2021).
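
The three parts of the design can be assembled in coded units (factorial levels at -1 and +1, centre at 0), as in the sketch below. The star-point distance used here, the fourth root of the number of factorial runs, is one common choice (giving a so-called rotatable design) and is an assumption of this example, as is the function name.

```python
from itertools import product


def central_composite(k, alpha):
    """Coded design points: full 2^k factorial, one centre point, 2k star points."""
    factorial = [list(point) for point in product((-1.0, 1.0), repeat=k)]
    centre = [[0.0] * k]
    star = []
    for i in range(k):
        for value in (-alpha, alpha):
            point = [0.0] * k
            point[i] = value  # vary one factor only, so the point lies on an axis
            star.append(point)
    return factorial + centre + star


k = 3
alpha = (2 ** k) ** 0.25  # assumed star distance (rotatable design)
design = central_composite(k, alpha)

print(f"{len(design)} design points for k = {k}")
for point in design:
    print(point)
```

Mapping the coded values back to physical values is a linear scaling of each column onto the corresponding parameter interval.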


A simpler, related method is a centred parameter study, where the factorial points are not used, and 
samples are only chosen along the axes of the parameter space (like the star points). Choosing 
more than three values along these axes allows for greater resolution of the non-linearity of the 
response, but this method only captures univariate sensitivities of the outputs to parameters (i.e. 
no combined variable effects). 


Box-Behnken 


In the Box-Behnken Design, experiments are run at the midpoint of each of the edges of the parameter
space and at the centre of the space. Three levels are therefore required for each parameter.


The main advantage of the Box-Behnken design is that, compared to the full factorial design and 
the central composite design, fewer experimental runs can be required to capture the parameter 
space, depending on the number of parameters. However, unlike the full factorial and central com- 
posite designs, the sampled space is smaller and provides a poor quality prediction in the ‘corners’ 
of the design space (NIST, 2021). 


Smolyak Sparse Grids 


A related method for reducing the number of evaluations needed from a ‘full tensor product’ (all
combinations of values of input variables, where each is sampled a specified number of times, $s$,
giving $s^k$ samples on a $k$-dimensional grid) is the use of sparse grids (or Smolyak sparse grids,
Gerstner and Griebel, 1998). They are applied, for example, in Dakota (Dalbey et al., 2020) to sample a
structured reduced set of uncertain inputs from a complete $k$-dimensional grid for surrogate model
UQ methods based on orthogonal polynomials (Section 4.4.2).
c: A two-dimensional Smolyak sparse grid of level 2 (circles) sampling from a normal distribution of standard
deviation σ for each parameter. Additional level 3 samples (crosses) retain the points from the level 2 grid
and add further samples. The levels contain 21 and 45 total samples respectively.


Figure 4.5: Structured sampling methods. 


Unstructured Methods 


Random Sampling 


A random sampling experimental design is performed by drawing ‘Monte-Carlo’ samples from the 
parameterised distributions of the input variables. In a random sampling design, all of the parameter 
space has an equal probability of being sampled. The Monte-Carlo samples could be independent, 
but in more complex designs correlation between input variable sets can be accounted for using 
joint probability (or multivariate) distributions (Giunta et al., 2003).


A random sampling experimental design is the simplest to envisage and perform, and (given suf- 
ficient sampling) results in an unbiased estimate of the mean response. However, it may require 
a large number of samples to adequately cover the extremes of the input distributions. This is
of particular concern if the uncertainty quantification is concerned with large confidence intervals 
(e.g. 99.9%) or if some input parameter distributions have long ‘tails’ which are significant (Andres, 
2002). 


A random sampling experimental design may be applied to an appropriate surrogate model, to use 
the faster surrogate to rapidly perform the large number of evaluations. 


Stratified Sampling 


Stratified sampling design aims to ensure a more uniform sampling over the design space com- 
pared to random sampling design. LHS is a form of stratified sampling design that is used exten- 
sively (and often the terms are used interchangeably in the literature). In LHS, if n samples are 
requested, then the range for each parameter is divided into n ‘bins’ of equal probability. Then, n 
samples are randomly selected such that exactly one sample is randomly placed inside each bin 
for each parameter. 


The construction of the LHS design can give a more accurate characterisation of a response (i.e. 
narrower confidence intervals on quantities like the mean and standard deviation) than a simple 
random sampling design for an equivalent number of samples. However, it is also possible for 
conditions required by LHS to be satisfied whilst being poorly arranged in the sample space. For 
example, if all the LHS samples in Figure 4.6 were arranged along the diagonal, this would lead to 
poor coverage of the parameter space (Bhattacharyya, 2018). 
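
A basic LHS construction on the unit hypercube is sketched below. Each coordinate is a cumulative probability, so a sample for a given input distribution would be obtained by passing it through that distribution's inverse CDF. The function name and sample size are assumptions, and no attempt is made here to optimise the arrangement of the points.

```python
import random


def latin_hypercube(n, k, rng):
    """Return n points in [0, 1)^k with exactly one point per bin in each dimension."""
    columns = []
    for _ in range(k):
        # One random location inside each of the n equal-probability bins
        column = [(i + rng.random()) / n for i in range(n)]
        # Shuffle so that bins are paired at random across dimensions
        rng.shuffle(column)
        columns.append(column)
    return [tuple(column[j] for column in columns) for j in range(n)]


rng = random.Random(3)
n, k = 10, 2
points = latin_hypercube(n, k, rng)

for dim in range(k):
    occupied_bins = sorted(int(p[dim] * n) for p in points)
    print(f"dimension {dim}: bins occupied = {occupied_bins}")
```

A set of points lying along the diagonal would pass the same bin check, which is why the bin condition alone does not guarantee good coverage of the space.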


Variations of stratified sampling are available, such as orthogonal array sampling (Giunta et al.,
2003), as are generalisations of LHS (Shields and Zhang, 2016, as applied, for example, by Banyay
et al., 2020).


Low Discrepancy Methods 


Purely random sampling inevitably generates dense clusters of samples. Quasi-Monte Carlo meth- 
ods produce deterministically generated sequences of samples, that have the appearance of being 
random, but are intended to cover the sampling space with more uniformity (lower discrepancy). 
Halton and Hammersley sequences are examples of these methods. 
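
A Halton sequence can be generated from the radical inverse (van der Corput sequence) of the sample index in a different prime base for each dimension. The sketch below is a plain implementation without scrambling; the function names are assumptions.

```python
def van_der_corput(index, base):
    """Radical inverse of a positive integer index in the given base."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton(n, bases=(2, 3)):
    """First n points of the Halton sequence, one prime base per dimension."""
    return [
        tuple(van_der_corput(i, base) for base in bases)
        for i in range(1, n + 1)
    ]


points = halton(100)
for point in points[:5]:
    print(point)
```

The sequence is deterministic, so repeating the call gives the same points, and more points can be appended without moving the existing ones.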


Another low discrepancy method is a centroidal Voronoi tessellation, where the sample points
represent the centroids of surrounding space filling polygons. The positions of the samples are
iteratively adjusted to achieve an approximately uniform distribution of polygon shapes and volumes.
This is similar to the operation of some automated tetrahedral and polyhedral mesh generators¹⁰
used for CFD.


10 Often mesh generators construct 3 dimensional Delaunay triangulations, which are the ‘dual’ of the Voronoi tessellation. 
Methods used for sampling operate in k dimensions, giving the higher dimensional equivalents of tetrahedra and polyhedra. 


[Figure: four scatter plots of sample locations on the unit square, with both axes running from 0 to 1.
Surviving panel titles: Random (100 samples); Halton sequence (100 samples). The caption indicates
that the remaining two panels show LHS and centroidal Voronoi samples.]


Figure 4.6: Unstructured sampling methods. The clustering of points characteristic of 
random sampling is seen, and contrasted with the more even spacing of the Halton 
sequence. The bin width for the LHS example is marked, showing that there is only 
one sample in each bin for each parameter dimension, although the coverage of the 
parameter space is not as uniform as achieved by the centroidal Voronoi method. 


4.3.3 Selecting a Method 


The appropriate sampling or DoE method to choose depends on the number of parameters being 
sampled, the range and shape of their variation, the cost and behaviour of the model that is being 
used to evaluate them, and the objective of the calculation: 


* For an initial sensitivity assessment, a centred parameter study or central composite design
may be useful to screen the relative effects of a larger number of parameters and establish 
the expected range of outputs. A centred parameter study with a larger number of samples 
for each parameter can identify finer structure and non-linearity in the results. 


* Random number generators can easily draw samples in proportion to the probability density
from any form of input distribution (they do not need to uniformly sample the parameters).
In addition, because random or LHS sampling methods draw their samples from the PDFs of the
uncertain parameters in a known way, the inputs are linked to the probability of the outputs in
a way that permits statistical rigour in their interpretation, which may be necessary in some
UQ applications.


* Low discrepancy methods provide more uniform coverage of the parameter space, but do
not possess these more precise statistical properties. For samples being used to construct 
surrogate models, however, it is more important that the number of experiments performed 
and range of parameters used are sufficient to avoid significant distortion of the distribution 
of output metrics produced by the surrogate. For example, a response surface fitted to the 
samples from an experimental design with insufficient coverage of the parameter space may 
lead to an inappropriate functional form being chosen (e.g. a quadratic relationship may be 
fitted when in fact a higher order polynomial would have been more appropriate). This may 
introduce systematic uncertainty that is not accounted for in the calculated output metric 
uncertainty distribution. For a response surface application, adequate samples are needed
within the central region of parameter space, in which case the Central Composite and
Box-Behnken methods are likely to perform poorly.


* Building particular kinds of surrogate model may benefit from the structured arrangement
and known resolution in each direction offered by sparse grid methods. 


Guidance on which method to choose is difficult to formulate, other than to recommend that several 
options are assessed, that the coverage of parameter space is considered, and that the charac- 
teristics of the model response are evaluated carefully at each stage to choose where and how to 
concentrate attention. 


Surrogate Models 


The widespread availability of computing power has led to complex, high-fidelity numerical simula- 
tions (such as CFD) of physical phenomena being widely developed and applied. However, many 
situations during the design and licensing of a nuclear reactor motivate exploration and optimisa- 
tion across a large parameter space. In this situation, high-fidelity computer simulations are time 
consuming and costly. For example, complex coupled multiphysics simulations (such as those
including structural or neutronic models) may take many days to evaluate (potentially using
hundreds or thousands of CPU cores) and involve solving equations for millions of degrees of
freedom (grid/mesh points).


In some situations it is beneficial to develop simpler models which retain a sufficiently accurate 
representation of the important underlying physical phenomena. These simpler models are often 
referred to as surrogate models, response surface models, metamodels, fully equivalent opera- 
tional models or emulators. Surrogate models have applications beyond SA and UQ; because they 
are fast to evaluate, they can also be used where performing computationally intensive simulations 
is impractical, such as design optimisation or real-time simulation. An overview of the process of 
using a complex model of reality to build a simplified surrogate model, and their typical applications, 
is shown in Figure 4.7. 


Surrogate models can be broadly classed into three types: 


* Simplified physical models: a lower fidelity physical approximation of the high-fidelity model.


* Data-fit models: a response surface or emulator fitted to a sampled dataset from (ideally a
small number of) high-fidelity model simulations.


* Reduced Order Models (ROMs): the high-fidelity model is projected onto a basis of reduced
dimensionality compared to the original solution space.


Figure 4.7: Overview of the use of surrogate models.


Surrogate models use experimental design methods in two ways. First, it is necessary to choose
the points in the input parameter space for the full-order model evaluations needed to build a
surrogate model. Second, once the surrogate has been built and validated, it can be used to
produce ‘emulated’ simulation results, generated using a different method of choosing samples,
to provide the functionality of interest.


4.4.1 Requirements of a Surrogate Model 


A surrogate model is, by definition, a simplification of the full NTH model of the physical system 
of interest. It cannot, therefore, capture all of the physics of the high-fidelity model. This can be 
expressed mathematically as 

$$ y = f(x) = f'(x) + \delta $$


where $f(x)$ is the high-fidelity thermal hydraulics model, $f'(x)$ is the surrogate model and $\delta$ is the
approximation error associated with the surrogate model. Despite the modelling simplifications, a
surrogate model can be a useful tool for several purposes, including for uncertainty quantification,
if it meets certain requirements:


* The approximation error relative to the full-order model should be known (or bounded), and 
acceptably small in the parameter ranges of interest, and for the required purpose. For ex- 
ample, deriving accurate sensitivity information from a surrogate model may require there to 
be smaller errors than for other uses. 


* The direction and approximate magnitude of the response of the surrogate model to changes
in the input variables must be consistent with the underlying physics of the system¹¹.


* It must calculate the output metrics required for the model application.


* It must be sufficiently computationally efficient for the surrogate model application. 


To demonstrate that the surrogate model fulfils these requirements, it is important to compare its 
output against the full-order model. For example, the surrogate model could be fitted to a ‘training 
dataset’. The model performance should then be demonstrated on a different ‘test dataset’, not 
used in its construction, within the parameter space of interest to demonstrate whether approxi- 
mation error is suitably small. It is also possible to use cross validation, which involves repeatedly 
fitting and testing the surrogate model against different subsets of the available dataset, to assess 
its accuracy and to check for bias. 


The error in the surrogate model can be determined by examining the statistics output in the model 
fitting process (for example, R², the RMS error, the mean absolute error or the maximum absolute
error) and should be negligible compared to the uncertainties being propagated. 
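
The error statistics listed above are straightforward to compute once surrogate predictions and full-order results are available for a test dataset. The sketch below is illustrative only: the function name `surrogate_error_metrics` and the four-point dataset are hypothetical, and a real assessment would use a test set that was not used to fit the surrogate.

```python
import math


def surrogate_error_metrics(y_full, y_surrogate):
    """Compare surrogate predictions with full-order results on a test dataset."""
    n = len(y_full)
    errors = [ys - yf for yf, ys in zip(y_full, y_surrogate)]
    mean_full = sum(y_full) / n
    ss_res = sum(e * e for e in errors)                      # residual sum of squares
    ss_tot = sum((yf - mean_full) ** 2 for yf in y_full)     # total sum of squares
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "max_abs": max(abs(e) for e in errors),
    }


# Hypothetical test dataset: full-order outputs and the matching surrogate predictions
y_full = [1.0, 2.0, 3.0, 4.0]
y_surr = [1.1, 1.9, 3.2, 3.8]
metrics = surrogate_error_metrics(y_full, y_surr)
```

Whether the resulting values are acceptable depends on the application; as noted above, they should be small compared with the uncertainties being propagated.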


Types of Surrogate Model 


Simplified Physical Model 


One option for a surrogate model is to use a lower fidelity thermal hydraulics code that captures the 
important physics of the full-order model, whilst being less computationally expensive. For example, 
the full-order model could be a detailed CFD model; however, a porous model or other coarse
grid representation (see Volume 2, Sections 3.4.3.3 and 4.2) may offer substantial computational 
savings while retaining much of the behaviour of importance. Through the use of a ‘test dataset’, the 
simplified representation can be compared to the detailed CFD analysis to establish that it captures 
the relevant physical phenomena and allows the output parameter of interest to be calculated 
accurately with only a small approximation error. 


A simplified physical model is likely to give the most accurate replication of the full-order model 
when extrapolating outside of the dataset used to derive or validate the surrogate model, because 
it is physically based. However, it is also likely that it will be more computationally expensive than 
some of the other techniques discussed below and may be more onerous to develop. 


A simplified physical model may also be included as a part of a full-order model to reduce com- 
putational expense. For example, a porous representation may be included in regions of a CFD 
model to reduce computational cost for experimental calculations. A surrogate model, using one 
of the techniques outlined below, may then be built from this to perform more detailed uncertainty 
quantification. 


11 For example, in a system comprising an ideal gas, where P/ρ = RT, if T increases but the surrogate does not
produce a physically compatible change in P or ρ, it is invalid. In this simple example, the expected behaviour is easy to
predict and assess; for complex, realistic models and more subtle effects, it may not be.


Correlation-based Model


Correlations are the simplest form of a simplified physical model. They synthesise a particular 
idealised set of behaviour to a (usually) analytical expression linking non-dimensional numbers 
representing the flow or heat transfer regime to responses of interest (e.g. pressure drop and heat 
transfer). They are often based on experimental data and can have an underlying physical basis to 
their algebraic relationship (although this is not a requirement) and are likely to have coefficients 
derived via regression methods (Section 4.2.4). 


Correlations are key to all aspects of system codes (a system code model for a complex two-phase 
flow accident could have hundreds of correlations embedded within it) and they are often used to 
provide boundary conditions for CFD models. 


Volume 2, Section 2.1.2.3 discusses heat transfer correlations in more detail, and Volume 5, Sec- 
tion 3.1 provides guidance specific to liquid metals, but also raises a point about a source of un- 
certainty that may be easy to overlook. Correlations are often intended to create a relationship 
suitable for an ‘ideal’ representation of a particular geometrical configuration and flow (such as a 
tube bundle). However, any given experiment has systematic biases in operation. The variation 
seen in results for the same basic geometry highlights this, and correlations that combine all of 
the available data gain universality from doing so, but their specificity for reproducing an individual 
dataset is reduced (although this is hard to separate from the contributions made by measurement 
uncertainties). 


Polynomial Response Surface Model 


A response surface model maps input variables to the model outputs. Typically a full-order model is 
run to develop a dataset of predicted output metrics for a chosen set of input (or predictor/regressor) 
variables chosen by an appropriate experimental design method (Section 4.3); a response surface 
is then fitted to this data using standard regression techniques. The response surface is typically a 
first-order or second-order polynomial: 


$$ y = \beta_0 + \sum_{j=1}^{k} \beta_j x_j + \sum_{j=1}^{k} \sum_{i=1}^{k} \beta_{ij} x_i x_j + \delta $$


where $y$ represents the response, $x_i$ represents the input variables, $\beta_j$ and $\beta_{ij}$ are the regression
coefficients, $k$ is the number of input variables and $\delta$ is the error. Regression is used to determine the
‘best-fit’ coefficients, with the aim of minimising the response surface error, $\delta$.


The response surface can then be interrogated for different input values to understand the impact 
on the required output metrics. Care should be taken near to the boundaries and particularly when 
extrapolating outside of the input range of the dataset to which the response surface has been 
fitted. It is likely that the response surface error will increase significantly outside of this dataset. 
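
As an illustration of fitting a second-order response surface by regression, the sketch below fits a quadratic in a single input (k = 1) using the least-squares normal equations. It is a minimal, standard-library example under stated assumptions: the 'full-order model' is a hypothetical analytic function, and the helper names (`solve_linear`, `fit_quadratic`, `evaluate`) are not taken from any particular toolkit.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_quadratic(xs, ys):
    """Least-squares fit of y = b0 + b1*x + b2*x**2 via the normal equations."""
    basis = [[1.0, x, x * x] for x in xs]
    ata = [[sum(row[i] * row[j] for row in basis) for j in range(3)] for i in range(3)]
    aty = [sum(row[i] * y for row, y in zip(basis, ys)) for i in range(3)]
    return solve_linear(ata, aty)


def evaluate(coeffs, x):
    """Interrogate the fitted response surface at a new input value."""
    b0, b1, b2 = coeffs
    return b0 + b1 * x + b2 * x * x


def full_model(x):
    """Hypothetical stand-in for an expensive full-order model."""
    return 1.0 + 2.0 * x + 3.0 * x * x


xs = [-1.0, -0.5, 0.0, 0.5, 1.0]   # design points within the range of interest
coeffs = fit_quadratic(xs, [full_model(x) for x in xs])
```

Because the stand-in model is itself quadratic, the fit recovers its coefficients; for a real model the residual error should be checked, particularly near and beyond the edges of the sampled range.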


Polynomial Chaos Expansion Model


In a PCE surrogate model the output is expanded into a series of orthogonal polynomials, which 
are a function of the uncertain inputs x: 


$$ y \approx f'(x) = \sum_{n=0}^{P} c_n \phi_n(x) $$

where $\phi_n$ are the (multivariate) polynomials of order $p$, and $c_n$ are the expansion coefficients. The
number of terms in the expansion is $P + 1 = (k + p)!/(k!\,p!)$, where $k$ is the number of uncertain
parameters¹². The order of the polynomials, $p$, is a user choice; it requires that the sampling
of uncertain variables in each direction is sufficient to provide enough points. When the PCE
has converged, higher order coefficients should decay to zero and increasing $p$ further causes
no further change to the results. An illustration of how the PCE maps input variable uncertainty
distributions to output variable distributions is shown in Figure 4.8.
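
The growth in the number of expansion terms with the number of uncertain parameters and the polynomial order can be checked directly from the total-order expression. The following short sketch assumes Python 3.8 or later (for `math.comb`); the function name `pce_term_count` is illustrative.

```python
from math import comb


def pce_term_count(k, p):
    """Number of terms, P + 1 = (k + p)! / (k! p!), in a total-order PCE."""
    return comb(k + p, p)


# Number of terms for k uncertain parameters at polynomial orders p = 1, 2, 3
for k in (2, 5, 10):
    print(k, [pce_term_count(k, p) for p in (1, 2, 3)])
```

For example, two parameters at second order give 6 terms, whereas ten parameters at third order give 286, which indicates why the number of full-order evaluations needed rises with the number of uncertain parameters.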


[Figure 4.8 plot: y = 0.5 + 0.2 He_1(x) + 0.02 He_2(x). Axes: uncertain input, x, with probability
of input, p(x); output from PCE surrogate, y, with probability of output, p(y).]


Figure 4.8: Example of a PCE, where a standard normal distribution of a single variable,
x, is mapped through 0th, 1st and 2nd order Hermite polynomials to produce the PDF of an
output y. Adapted from the example given by Debusschere (2017).


The first step of PCE defines the orthogonal polynomials $\phi_n$. The form of the orthogonal
polynomials is linked to the form of the PDF of the uncertain parameter of interest. For example, if the
uncertainty in a variable is normally distributed, the orthogonal polynomials will be Hermite
polynomials; for a uniformly distributed input, they will be Legendre polynomials (Dalbey et al., 2020).
In practice, software toolboxes can choose and generate these orthogonal polynomials internally.


12 This expression is for the ‘total-order expansion’ method of creating a PCE, where each parameter uses the same
polynomial order, p. Other options are available that can reduce the number of terms, including allowing each parameter
to select its optimal order (Dalbey et al., 2020).


The next step estimates the expansion coefficients $c_n$. These will often be derived numerically,
either:


* Using regression to minimise the difference between the value predicted by the polynomial
expansion and the full-order model output results, corresponding to input samples chosen 
by a suitable experimental design method (such as LHS), or 


* Using a ‘spectral projection’ method, exploiting orthogonality in the sampling (or sparse grid
sampling) of the parameter space. 


Once the expansion coefficients are known, the PCE is an analytical expression from which the
statistical moments defining the uncertainty in the output (such as the mean and variance) can be
calculated directly. In addition, the polynomial chaos expansion, once derived, can be interrogated like
a response surface. PCE is an attractive surrogate model method because it requires a relatively
small number of full-order model samples to generate. However, its implementation is not
straightforward¹³ and the number of polynomials involved can become large, so it is recommended that
a software toolkit such as Dakota¹⁴ or Uncertainpy (Tennøe et al., 2018) is used to exploit this
method.
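
To illustrate how moments follow directly from the coefficients, the sketch below uses the single-variable expansion shown in Figure 4.8, y = 0.5 + 0.2 He_1(x) + 0.02 He_2(x), with a standard normal input. It assumes the probabilists' Hermite polynomials, for which E[He_m He_n] = n! when m = n and zero otherwise; the function names are illustrative, and a sampled mean is included only as a cross-check.

```python
import math
import random


def hermite_pce_moments(coeffs):
    """Mean and variance of y = sum_n c_n He_n(x) for a standard normal x.

    Uses orthogonality of the probabilists' Hermite polynomials:
    E[He_m He_n] = n! when m == n and 0 otherwise.
    """
    mean = coeffs[0]
    variance = sum(c * c * math.factorial(n) for n, c in enumerate(coeffs) if n > 0)
    return mean, variance


def evaluate_pce(coeffs, x):
    """Evaluate the expansion with He_0 = 1, He_1 = x, He_{n+1} = x He_n - n He_{n-1}."""
    he_prev, he = 1.0, x
    total = coeffs[0]
    for n in range(1, len(coeffs)):
        total += coeffs[n] * he
        he_prev, he = he, x * he - n * he_prev
    return total


coeffs = [0.5, 0.2, 0.02]            # coefficients of the expansion in Figure 4.8
mean, variance = hermite_pce_moments(coeffs)

# Cross-check the analytical mean by sampling the (cheap) expansion
rng = random.Random(1)
samples = [evaluate_pce(coeffs, rng.gauss(0.0, 1.0)) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)
```

The analytical route needs no further sampling, whereas the sampled estimate carries statistical noise that only reduces slowly as more samples are drawn.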


PCE can be much faster (by requiring fewer full-model evaluations) than other uncertainty quan- 
tification methods, as long as the number of uncertain parameters is relatively low and the inputs 
are uncorrelated (Roy and Oberkampf, 2011). However, as the number of uncertain parameters 
increases, so does the number of full-order model evaluations required to achieve the same con- 
vergence. 


Kriging model 


Kriging based surrogate models, also known as Gaussian process regression, are a form of
response surface. As with other types of surrogate, a Kriging model can be interrogated many times
to propagate the uncertainty in model parameters through to model outputs.


A Kriging model is formed of two parts: 


* A regression of the samples from the full-order model. 


* The Gaussian process term. 


The Gaussian process term takes into account the correlation between samples to estimate the 
uncertainty in the regression. Kriging methods derive this uncertainty on the basis that samples 
taken close together in parameter space will be highly correlated whereas those taken further 
apart are less likely to be highly correlated. Therefore, a Kriging model can provide not only a 
prediction of the model response, but also the uncertainty in the prediction as a result of using the 
surrogate model. 

This is illustrated in Figure 4.9. It can be seen that close to the training data, the uncertainty in the
prediction is small, but this uncertainty grows as the surrogate model is sampled further from the
training dataset.


13 It is noted that PCE can be implemented both intrusively (through direct modifications to the source code) and
non-intrusively (Najm, 2009). The focus here is on non-intrusive methods, because it is assumed that an analyst will either
not have access to the source code of the full-order model, or sufficient time or knowledge to modify it.
14 dakota.sandia.gov


[Figure 4.9 annotations: Regression is used to find a function (line) that represents a set of data
points as closely as possible. A Gaussian process is a probabilistic method that gives a confidence
for the predicted function. Additional datapoints increase the specificity of the fit and reduce
uncertainty near to them.]


Figure 4.9: Example of a Kriging model, showing the uncertainty bounds, and that they
reduce close to the sampling points used to construct it. Adapted from Görtler et al.
(2019).


Kriging interpolation techniques were originally developed for use in the field of geostatistics, where 
geological samples are taken at irregularly spaced intervals and Kriging interpolation is used to 
create underground maps based on this information. They are also widely used in machine learning 
(Rasmussen and Williams, 2006) as well as in nuclear applications (Banyay et al., 2019). Kriging
methods have the advantage that they can accommodate irregularly spaced data and capture 
‘peaks’ and ‘troughs’ in the full-order model, thereby allowing for models with more complicated 
structure in responses. 


It is noted that Kriging models can become unstable as the number of sample points from the full- 
order model increases. This is because the correlation matrix that forms part of their description 
can become ill-conditioned as the distance between sample points decreases (Giunta et al., 2006).
Therefore they are best suited to sparse sample sets. 
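
The behaviour of the prediction uncertainty can be demonstrated with a very small one-dimensional example. The sketch below is a simplified form of Kriging under stated assumptions: a zero-mean process with no regression trend, a squared-exponential correlation function with a fixed, hypothetical length scale, and a small 'nugget' added to the diagonal of the correlation matrix to limit ill-conditioning. Toolkits normally estimate these hyperparameters from the data.

```python
import math


def kernel(a, b, length=0.3):
    """Squared-exponential correlation between two 1-D sample locations."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def krige(x_train, y_train, x_new, nugget=1e-10):
    """Zero-mean Kriging: predicted mean and variance at x_new (unit prior variance)."""
    n = len(x_train)
    k = [[kernel(x_train[i], x_train[j]) + (nugget if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_star = [kernel(x, x_new) for x in x_train]
    weights = solve(k, k_star)                       # K^-1 k*
    mean = sum(w * y for w, y in zip(weights, y_train))
    variance = 1.0 - sum(w * ks for w, ks in zip(weights, k_star))
    return mean, max(variance, 0.0)


x_train = [0.0, 0.25, 0.5, 0.75, 1.0]
y_train = [math.sin(2.0 * math.pi * x) for x in x_train]   # hypothetical model output
mean_near, var_near = krige(x_train, y_train, 0.5)   # at a training point
mean_far, var_far = krige(x_train, y_train, 2.0)     # well outside the training data
```

The predicted variance is close to zero at a training point and approaches the prior variance far from the data. Moving the training points closer together makes the correlation matrix increasingly ill-conditioned, which is the instability described above.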


4.4.2.6 Model Order Reduction


ROMs take the full-order, high-fidelity model and project this onto a lower-dimensional basis which 
is more efficient and faster to solve. Reduced order methods have the advantage that they retain 
the underlying structure of the model, however they can be more intrusive than some of the other 
surrogate model methods discussed in this section. This means that the full-order model cannot 
necessarily be treated as a ‘black box’, and requires knowledge of the underlying model form in 
order to derive the reduced order model. 




POD, balanced truncation and Krylov subspace methods are examples of methods used for
projection-based ROMs (Benner et al., 2015). POD (also known as the Karhunen-Loève method in the field of
stochastic theory, or principal component analysis in the field of statistical analysis) is a
snapshot-based ROM method that has been applied to CFD problems (Section 3.4.4).


In POD, a ‘snapshot matrix’ is generated, which is a collection of observations of the quantities
of interest (such as the velocity or temperature fields) at either a series of instants in time, or as
separate realisations¹⁵ of a flow, but with a parametric variation (in flow rate, for example). From this
snapshot matrix, a set of empirical eigenfunctions is derived, which forms the basis of the ROM.
These eigenfunctions identify the most energetic modes of the full-order model. The ROM is then
obtained from the Galerkin projection of the governing equations onto this POD basis. Georgaka
et al. (2020) provide a more detailed description of the derivation of a ROM via POD.


4.5 Sensitivity Analysis


Sensitivity analysis is used to determine which model inputs have an important effect on the output 
metrics. Although similar to uncertainty quantification, the focus of sensitivity analysis is on iden- 
tifying the important variables and relationships in a model, rather than in numerically quantifying 
and propagating uncertainty through to the model output. Local sensitivity analysis involves study- 
ing the relative influence of different parameters at a given point in the input space. On the other 
hand, global sensitivity analysis examines the effect of one parameter whilst other parameters are 
also varied, thus allowing the effect of interactions between variables to be accounted for. In addition,
the analysis is not sensitive to a chosen nominal point in input space (Kucherenko and Iooss,
2017). Global sensitivity analysis techniques are the focus of the remainder of this section.


Sensitivity analysis usually involves the following steps: 


1. Identify the model responses of interest (the key safety or performance metrics).


2. Identify potentially important input parameters.


3. Propose plausible bounds on these input parameters for the sensitivity analysis.


4. Use experimental design techniques (Section 4.3) to construct a dataset on which to perform
the sensitivity analysis.


5. Execute the full-order (or surrogate) model evaluations for the chosen dataset.


6. Perform sensitivity analysis by calculating or interpreting the measures and metrics
discussed below. Many software packages have built-in functionality that can assist with this.


15 Chosen using the techniques described in Section 4.3. 


Scatter Plots


While not a ‘measure’ of sensitivity per-se, scatter plots provide a useful first step to examine the 
impact of different model inputs on the model outputs of interest, allowing a visual identification of 
trends and structure in the variation of model output with variations in input. 


Sensitivity Coefficients 


The Jacobian is the first order gradient (linear sensitivity coefficient) of the variation of each of m 
model outputs, f, in response to n model inputs, x: 


$$ J = \frac{\partial f}{\partial x} = \begin{bmatrix} \dfrac{\partial f_1}{\partial x_1} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial f_m}{\partial x_1} & \cdots & \dfrac{\partial f_m}{\partial x_n} \end{bmatrix} $$


Where the mathematical form of the model equations is suitable and sufficiently simple,
analytical methods are available such that the sensitivity and uncertainty in them can be derived
without needing to apply numerical and statistical methods. In this case, and where there is only
one output variable, a Taylor expansion can be used to determine the sensitivity at higher orders,
and then be used to analytically propagate uncertainty. For mathematically analysable models
with multiple inputs and outputs, partially differentiating the expressions provides the Jacobian
algebraically, which can then be evaluated at any point in the parameter space. However, the Jacobian
is typically estimated using numerical methods, because the model used is too complex to assess
analytically. A particular numerical Jacobian only applies at one point in parameter space (such as
the centre of the input distributions), so it is a local sensitivity.
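
A numerical Jacobian is commonly estimated by perturbing each input in turn. The following is a minimal central-difference sketch; the two-input, two-output `toy_model`, the relative step size and the function names are assumptions made for illustration, and an appropriate step size for a real model depends on its numerical noise.

```python
def numerical_jacobian(model, x0, rel_step=1e-6):
    """Central-difference estimate of J[i][j] = d f_i / d x_j at the point x0."""
    n = len(x0)
    columns = []
    for j in range(n):
        h = rel_step * max(abs(x0[j]), 1.0)
        x_plus, x_minus = list(x0), list(x0)
        x_plus[j] += h
        x_minus[j] -= h
        f_plus, f_minus = model(x_plus), model(x_minus)
        columns.append([(fp - fm) / (2.0 * h) for fp, fm in zip(f_plus, f_minus)])
    # Transpose so that rows correspond to outputs and columns to inputs
    return [list(row) for row in zip(*columns)]


def toy_model(x):
    """Hypothetical model with n = 2 inputs and m = 2 outputs."""
    return [x[0] * x[1], x[0] + x[1] ** 2]


# Local sensitivity at the nominal point (2, 3); analytically [[3, 2], [1, 6]]
jac = numerical_jacobian(toy_model, [2.0, 3.0])
```

Each Jacobian costs 2n model evaluations with this scheme, and the result describes the sensitivity only at the chosen nominal point.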


It may be beneficial to deliberately frame the complexity and mathematical representation of a 
system to facilitate the efficient application of analytical methods, including more sophisticated 
techniques such as adjoint methods (Cacuci, 2003). 


Correlation Coefficients 


Correlation coefficients are a method of quantifying the statistical relationship between two param- 
eters or two sets of data. There are two commonly used correlation coefficients (Hofer, 2018): 


The simple correlation coefficient (also known as the Pearson correlation coefficient) measures 
the strength and direction of a linear relationship between two variables. The coefficient can
vary between -1 and 1, with a correlation of 1 indicating a perfect, positive correlation and -1 
indicating a perfect, negative correlation. If there is no or little linear correlation, the coefficient 
will be close to 0. 


The rank correlation coefficient (also known as the Spearman coefficient) performs correlations 
on ranked data giving a measure of the monotonicity of the relationship. In order to obtain 
ranked data, it is ordered such that the smallest value in a set of samples is given the rank 
1, the next smallest the rank 2 and so on. Rank correlation coefficients can be used if there 
is a non-linear relationship between the two variables. 


Each type can have a ‘partial’ variant:


Partial correlation coefficients are similar to the simple and rank correlation coefficient, but allow 
for the correlation between two variables to be determined whilst controlling for the effects of 
the other variables present in the problem. 


It is also possible to determine the statistical significance of the correlation. This can be expressed 
as a p-value (between 0 and 1), which is often returned as an additional output with the correlation 
coefficient when using a statistical toolbox. 
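
The difference between the simple and rank correlation coefficients can be seen for a monotonic but strongly non-linear relationship. The sketch below is illustrative: the function names are hypothetical, the ranking helper assumes there are no tied values, and the exponential response is an arbitrary example.

```python
import math


def pearson(xs, ys):
    """Simple (Pearson) correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def ranks(values):
    """Rank 1 for the smallest value, 2 for the next, and so on (no ties assumed)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


def spearman(xs, ys):
    """Rank (Spearman) correlation: the Pearson coefficient of the ranked data."""
    return pearson(ranks(xs), ranks(ys))


xs = [0.1 * i for i in range(1, 21)]
ys = [math.exp(3.0 * x) for x in xs]      # monotonic but strongly non-linear response
r_pearson = pearson(xs, ys)
r_spearman = spearman(xs, ys)
```

Here the rank coefficient is one, because the relationship is perfectly monotonic, while the simple coefficient is noticeably lower because the relationship is not linear.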


Sobol Indices 


Whilst correlation coefficients test for a linear or monotonic relationship, Sobol indices (also known 
as variance-based decomposition) provide a measure of how much the output variance is at- 
tributable to each input variable. 


There are two principal sensitivity indices. The first is the main effect sensitivity index, $S_i$ (Saltelli
et al., 2007; Hofer, 2018):

$$ S_i = \frac{\mathrm{Var}(E(f|x_i))}{\mathrm{Var}(f)} $$

where E is the ‘expected’ value (the mean) and Var is the variance. The numerator¹⁶ is the output
variance of the main effect of $x_i$, attributable to that parameter only. The denominator is the total
variance in the output. Higher order terms can also be calculated, known as interaction terms. It is
required that

$$ \sum_{i} S_i + \sum_{i} \sum_{j>i} S_{ij} + \dots = 1 $$


A large main effect index indicates that the input parameter in question has an important effect on 
the output of interest. 


The second metric is the total effect index $T_i$:

$$ T_i = \frac{E(\mathrm{Var}(f|x_{-i}))}{\mathrm{Var}(f)} = \frac{\mathrm{Var}(f) - \mathrm{Var}(E(f|x_{-i}))}{\mathrm{Var}(f)} $$

This measures the contribution to the output variance of $x_i$, including the effects of its interactions
with other variables¹⁷, and has the advantage that it does not need to calculate the interaction
terms ($S_{ij}$ and $S_{ijk}$) separately. A larger total effect index for a given parameter compared to $S_i$
indicates that this parameter interacts with other parameters, and the sum of total effect indices is
greater than or equal to one (it only equals one when there are no interactions; Glen and Isaacs,
2012).


Although analytically calculating the Sobol indices is possible, the number of sensitivity indices 
quickly becomes unmanageable as the number of inputs increases. This means that usually the 
sensitivity indices are solved for numerically. One exception to this is if a PCE surrogate model has 
been created — the properties of the PCE mean that the Sobol indices can be calculated analytically 
(Sudret, 2008). 
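
A numerical estimate of the main and total effect indices can be sketched with a sampling ('pick-freeze') approach. The example below is a minimal illustration rather than a recommended implementation: it assumes independent inputs uniformly distributed on [0, 1], uses one commonly quoted pair of estimators (a centred main effect estimator and a Jansen-type total effect estimator), and tests them on a hypothetical additive function for which the exact values are S = T = (0.2, 0.8).

```python
import random


def sobol_indices(model, n_inputs, n_samples, rng):
    """Pick-freeze estimates of main effect (S_i) and total effect (T_i) indices
    for independent inputs uniformly distributed on [0, 1]."""
    a = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    b = [[rng.random() for _ in range(n_inputs)] for _ in range(n_samples)]
    f_a = [model(row) for row in a]
    f_b = [model(row) for row in b]
    both = f_a + f_b
    mean = sum(both) / len(both)
    variance = sum((v - mean) ** 2 for v in both) / (len(both) - 1)

    main, total = [], []
    for i in range(n_inputs):
        # Rows of A with input i replaced by the corresponding value from B
        f_ab = [model(ra[:i] + [rb[i]] + ra[i + 1:]) for ra, rb in zip(a, b)]
        # Main effect: output is centred on the mean to reduce estimator noise
        main.append(sum((fb - mean) * (fab - fa)
                        for fa, fb, fab in zip(f_a, f_b, f_ab)) / n_samples / variance)
        # Total effect: half the mean squared change when only input i is resampled
        total.append(sum((fa - fab) ** 2 for fa, fab in zip(f_a, f_ab))
                     / (2.0 * n_samples) / variance)
    return main, total


def additive_model(x):
    """Hypothetical test function with no interactions: S = T = (0.2, 0.8)."""
    return x[0] + 2.0 * x[1]


main, total = sobol_indices(additive_model, 2, 20000, random.Random(2))
```

The cost is n_samples × (n_inputs + 2) model evaluations, which is why a surrogate model is often used in place of the full-order model for this type of analysis.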


16 The vertical line in $E(f|x_i)$ denotes the ‘conditional expectation’, e.g. the mean of the function $f$, as it depends on $x_i$.
17 $(f|x_{-i})$ denotes being conditional on all inputs except $x_i$.


Uncertainty Quantification by Propagation


Quantification of the effect of uncertain model parameters and of initial and boundary conditions is 
a common application for UQ. This takes the uncertain parameters, characterised into probability 
distributions using the methods described above, and propagates them through a modelling tool 
to predict the distribution of the model outputs. This section covers the propagation and UQ where 
all input variables are characterised by PDFs. How to combine these methods with other forms of 
uncertainty is discussed in Section 4.8. 


Sampling approaches are the simplest methods of parametric uncertainty propagation, so are 
usually the first step in any UQ process: 


Sample from the assumed distribution for each input parameter: LHS should be the default 
choice of sampling in most cases. It is important to have a sufficient sample size to ensure 
that there is adequate sampling from all areas of a parameter’s probability distribution; Swiler 
and Giunta (2007) suggest that the number of samples should be at least 10 times the 
number of uncertain variables. 


Propagate the sampled values through the model: Evaluate the simulation results for each set 
of sampled values, treating the numerical model as a ‘black box’. Because all samples are 
independent and known at the outset of the calculation, evaluations can be carried out in 
parallel to speed up the analysis. 


Interpret the model output: For each output metric, the evaluation results can be plotted as a
histogram to assess the statistics and distribution of the predicted values. The median of
this distribution represents the best estimate value and its width (calculated as the variance,
σ²) represents the uncertainty. The cumulative sum of the histogram provides an estimate
of the CDF, which can be used, for example, to determine the likelihood of an output metric
exceeding a limit value. Taking more samples will generally allow an output distribution to be
smoother, its median and variance to be known with more certainty (this is discussed further
in Section 4.7), and the lower probability ends of the distribution to be better resolved.
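
The three steps above can be sketched end to end for a simple case. The example below is illustrative only: the 'black box' model, the normal input distributions, the limit value and all names are hypothetical, and the LHS helper stratifies each input into equal-probability bins before mapping through the inverse CDF (using `statistics.NormalDist`, available from Python 3.8).

```python
import random
from statistics import NormalDist, median, variance


def lhs_normal(n_samples, mean, std, rng):
    """LHS of one normally distributed input: one draw per equal-probability bin."""
    dist = NormalDist(mean, std)
    probs = [(i + rng.random()) / n_samples for i in range(n_samples)]
    rng.shuffle(probs)  # randomise the pairing with the other inputs
    return [dist.inv_cdf(p) for p in probs]


def model(power, flow):
    """Hypothetical 'black box': coolant temperature rise = power / (flow * cp)."""
    cp = 4.2  # kJ/(kg K), assumed constant
    return power / (flow * cp)


rng = random.Random(3)
n = 200  # comfortably more than 10 samples per uncertain input

# Step 1: sample from the assumed input distributions
powers = lhs_normal(n, 1000.0, 50.0, rng)   # kW
flows = lhs_normal(n, 10.0, 0.5, rng)       # kg/s

# Step 2: propagate each set of sampled values through the model
outputs = [model(p, f) for p, f in zip(powers, flows)]

# Step 3: interpret the output distribution
best_estimate = median(outputs)
spread = variance(outputs)
limit = 27.0  # K, hypothetical limit value
p_exceed = sum(1 for y in outputs if y > limit) / n
```

Because each evaluation is independent, the loop in step 2 is the part that would be distributed across parallel runs for an expensive model.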


Guidance on what tools can be used to help perform these steps, and suggested visualisation 
methods suitable for interpreting the outcome are discussed in Section 2.2. 


Use of Surrogate Models 


If the model to be sampled is fast enough to evaluate, then it can be used directly to gather enough 
data to provide the necessary statistics and output distribution. However, if the model is resource 
intensive, such that each sample evaluation requires extensive computation, it may not be practical 
to apply sampling UQ to it directly. 


In this case, surrogate models (Section 4.4) can be used. They can be used for purely
computational efficiency reasons, where a relatively small number (O(10²)) of sampled evaluations of the
‘full-order’ model can be used to construct the surrogate, which can itself be sampled and rapidly
evaluated a much larger number of times (O(10⁴) to O(10⁶)) to provide the output statistics and
distribution. They can also be employed as an efficient way to provide more refined and additional
information about the results.


There are a number of considerations when applying a surrogate for UQ:


* Giunta et al. (2006) discuss whether, when and why to use a surrogate model (referring to
them as trend models):

    ... this study attempts to answer the following question, ‘Given a set of N simulation
    output data samples, is it more accurate to estimate UQ statistics directly
    from the N data points, or is it more accurate to extract trend information from the
    N data points, and then estimate UQ statistics from the trend models?’

Although the study only deals with a simple problem with a small number of uncertain inputs, the
decision about using a surrogate model becomes implementation and case specific, so
generalised guidance is not possible. The authors conclude that caution is needed, and suggest applying
several surrogate model approaches, checking that they provide a similar result, checking
for a strong dependence on N, and comparing between a surrogate method and a purely
sampled result to select a suitable technique. They also found that for small N, there was not
necessarily a benefit obtained by using a surrogate.


• It should be remembered that using a simplified model to propagate the uncertainties introduces
its own contribution to the uncertainty, and this should be estimated and
shown to be sufficiently small. Kriging models can provide this uncertainty estimate as part
of their construction.


• Fitting Kriging models can become unreliable with larger numbers of samples (Giunta et al.,
2006), so a higher N is not necessarily better.


• PCE needs the model response to be smooth, but has the significant benefit that it can
provide Sobol indices and the mean and variance of the response directly, without further
sampling. A PCE can, however, also be rapidly sampled to provide a well resolved CDF.
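

The workflow above (a few full-model runs, a surrogate built from them, a check of the surrogate's own error, then many cheap surrogate samples) can be sketched as follows. This is an illustrative example only: `full_model` is a hypothetical stand-in for an expensive calculation, and a simple piecewise-linear interpolant is used as the surrogate in place of the Kriging or PCE methods discussed in Section 4.4.

```python
import bisect
import math
import random


def full_model(x):
    # Hypothetical stand-in for an expensive 'full-order' model (smooth response).
    return math.exp(0.5 * x) + 0.1 * x * x


# Step 1: a small design of full-model evaluations across the input range.
x_lo, x_hi, n_design = -4.0, 4.0, 21
xs = [x_lo + i * (x_hi - x_lo) / (n_design - 1) for i in range(n_design)]
ys = [full_model(x) for x in xs]


def surrogate(x):
    """Piecewise-linear interpolation of the design points, clamped at the ends."""
    x = min(max(x, x_lo), x_hi)
    i = min(max(bisect.bisect_right(xs, x) - 1, 0), n_design - 2)
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])


# Step 2: estimate the surrogate's own error at points not used to build it
# (this costs additional full-model evaluations).
checks = [0.5 * (a + b) for a, b in zip(xs, xs[1:])]
max_error = max(abs(surrogate(x) - full_model(x)) for x in checks)

# Step 3: sample the cheap surrogate many times for the output statistics.
rng = random.Random(2)
samples = [surrogate(rng.gauss(0.0, 1.0)) for _ in range(50_000)]
mean = sum(samples) / len(samples)
print(f"max surrogate error = {max_error:.3f}, surrogate mean = {mean:.3f}")
```

The held-out error in step 2 is a crude version of the check recommended above: the surrogate's contribution to the uncertainty should be estimated and shown to be small relative to the spread of the output.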


Reliability Methods 


In some situations, the mean behaviour of a system is not of interest, only the margin to, or behaviour
near to, a limit. In a structural integrity assessment, this is known as ‘reliability’, and is
concerned with the approach to a ‘cliff-edge’ failure condition. Therefore, understanding the whole
span of the CDF of a response with equal resolution and accuracy is not desired. Uncertainty propagation
via reliability methods focuses on finding and characterising the parameter space near to
the failure limit, and evaluating the likelihood of exceeding it based on the uncertainty in the input
parameters.


Traditional First Order Reliability Method (FORM) and Second Order Reliability Method (SORM)
approaches approximate the failure surface in parameter space as linear or quadratic, and allow
the probability of being on either side of it to be assessed. There are several more advanced variants
available to find the failure surfaces that are similar to optimisation searches: they adaptively
select new evaluation points to resolve the failure region using ‘importance sampling’ (MacKay,
2003). Reliability methods are therefore often more efficient at resolving the ‘tails’ of input distributions
for low probability conditions than sampling methods. These methods often make use of
underlying Gaussian process (Kriging) surrogate models in their optimisation evaluations.
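

To illustrate why a reliability method can be more efficient than sampling, the sketch below treats the simplest case: a linear failure surface g = R − S with independent, normally distributed R and S, for which the first-order result is exact. The variable names and numerical values are assumptions chosen for illustration. The crude Monte-Carlo check needs a large number of samples to resolve the same small probability.

```python
import math
import random
from statistics import NormalDist

# Assumed independent normal variables (illustrative values only):
mu_r, sigma_r = 600.0, 20.0   # R: the limit, e.g. a temperature at which failure occurs
mu_s, sigma_s = 525.0, 15.0   # S: the predicted response of the system

# Limit state g = R - S; failure is g < 0. For a linear g with normal inputs,
# the reliability index beta and the failure probability follow directly.
beta = (mu_r - mu_s) / math.sqrt(sigma_r**2 + sigma_s**2)
p_fail_form = NormalDist().cdf(-beta)

# Crude Monte-Carlo check: most samples fall far from the failure surface.
rng = random.Random(3)
n = 200_000
failures = sum(rng.gauss(mu_r, sigma_r) < rng.gauss(mu_s, sigma_s) for _ in range(n))
p_fail_mc = failures / n

print(f"beta = {beta:.2f}, FORM = {p_fail_form:.2e}, Monte-Carlo = {p_fail_mc:.2e}")
```

For non-linear models the failure surface is not known in advance, which is where the adaptive search methods described above are needed.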


Further discussion and guidance on choosing an uncertainty quantification approach is available
from the documentation of tools used to perform it, such as Dakota (Adams et al., 2020, Chapter
5).


Assessment of Output Uncertainty from Propagation 


The aim of the uncertainty quantification process of propagating uncertain inputs defined by a 
probability distribution is to calculate the PDF or CDF of an output metric. This section outlines 
some of the principles useful for interpreting the probability distribution representing the uncertainty 
in the output metric. 


Confidence Intervals and Levels 


The principle of a Confidence Interval (CI) is illustrated in Figure 4.10, where histograms of an
output metric from simulation results visualise the effect of aleatory uncertainty in the inputs. A CI
represents the range of values within which the output metric of interest is potentially contained. A
confidence interval is stated with a specified confidence level (sometimes called the degree of confidence).
The distinction between confidence and probability, and the correct statistical interpretation
of CIs, can be subtle and it is hard to express the details concisely (DeGroot and Schervish, 2012).
Following the explanation of Navidi (2020), consider a population with a mean, μ, from which a
finite number of samples is taken, giving a sample mean, X̄. When such a random sample is taken,
95 times out of 100 the (true, but unknown) population mean is expected to lie within the 95 % CI
of the sample mean.
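

The ‘95 times out of 100’ interpretation can be checked numerically. The sketch below repeatedly draws a sample from a population with a known mean, forms a 95 % CI on the sample mean using the normal approximation, and counts how often the interval contains the true mean. The population parameters and sample sizes are arbitrary assumptions for illustration.

```python
import random
import statistics

rng = random.Random(0)
mu, sigma = 10.0, 2.0        # assumed 'true' population mean and standard deviation
n, trials = 50, 2000         # samples per experiment, number of repeated experiments

# Two-sided 95 % interval: 2.5 % in each tail of the standard normal distribution.
z = statistics.NormalDist().inv_cdf(0.975)

covered = 0
for _ in range(trials):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    x_bar = statistics.fmean(sample)
    half_width = z * statistics.stdev(sample) / n**0.5
    if x_bar - half_width <= mu <= x_bar + half_width:
        covered += 1

coverage = covered / trials
print(f"fraction of 95 % CIs containing the true mean: {coverage:.3f}")
```

The fraction printed should be close to 0.95. It is a statement about the procedure over many repeated samples, not about any single interval.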


Two types of confidence interval exist: one-sided and two-sided. Figure 4.10 demonstrates a
two-sided confidence interval (a, b), which brackets the uncertainty above and below the central-estimate
value. A one-sided confidence interval brackets all of the uncertainty distribution on one
side of a value: (a, ∞) is the lower one-sided interval, and (−∞, b) is the upper one-sided interval.
In the example demonstrated in Figure 4.10, the upper end of the 95 % two-sided confidence
interval is equivalent to the one-sided upper 97.5 % confidence interval. One-sided intervals are
used in failure assessments where only a lower (or upper) bound of a limit is of interest. A two-sided
interval gives an indication of the spread of the uncertainty in a quantity.


The CI represents a trade-off: for a given distribution, narrower limits can be stated on the value
of the central estimate, but they then must represent a lower confidence level.


For a symmetric uncertainty distribution, the central-estimate (median or 50th percentile) is equivalent
to the mean and mode. However, if the calculated uncertainty distribution is asymmetric then
this no longer holds. The mode is the most likely occurring value, at the peak of the probability
distribution; the mean is the average value.


In a forward propagation of uncertainty calculation, the probability density of the output metric can 
be found by evaluating a set of Monte-Carlo samples (often using a surrogate model). The result of 
this approach is an empirical distribution of values of the output metric. The empirical distribution 
of Monte-Carlo samples represents the calculated uncertainty on the output metric. The median 
value and confidence interval can be found by either taking relevant percentiles of the empirical
distribution (e.g. 50th, 2.5th and 97.5th for a 95 % CI)


Figure 4.10: Example of the 95 % CI (a, b) of an output metric (represented by a normal
distribution randomly sampled either 500 times (left) or 5000 times (right) to simulate
a range of modelling results). Also indicated is how the percentiles of the sampled uncertainty
distribution are also uncertain in their own right: the uncertainty in them that
arises from the finite number of samples can be reduced by increasing the number
of evaluations (although this is not the only source of uncertainty).


A concept that often occurs when introducing UQ to an analysis process is that of a ‘baseline’ 
model that may have previously served as a deterministic model, and serves as a starting point 
around which uncertainty analysis is built. If symmetric uncertainty distributions are placed around 
its inputs, and the process modelled has an approximately linear response, then the results of the 
baseline model will lie near to the central-estimate (or 50th percentile) or the mode of the output
uncertainty distribution. However, this does not always need to be the case. It may be that the 
chosen uncertainty distributions of the inputs do not have the baseline model at their centre when 
they are assessed, or they are not symmetric and the model is non-linear, such that the effect of 
the dominant uncertainty is to influence the outputs to be mainly in one direction. It is therefore 
not inherently expected or required that the baseline results are representative of the median, after 
applying UQ. 


Confidence Interval Uncertainty 


It should be noted that the confidence intervals derived from any uncertainty assessment calculation
will themselves contain uncertainty. This can originate from two sources:


• The numerical calculations in the uncertainty assessment are not exact. This could be related
to performing a limited number of model calculations, limitations in the experimental design
or uncertainty in a surrogate model. In general this uncertainty can be quantified and should
be considered when interpreting a quoted CI and confidence level. It is more significant
when considering an output metric of interest in the tails of the uncertainty distribution than
for the mean (i.e. more samples will be required to determine the 99th percentile to
the same level of accuracy as the mean). This is shown in Figure 4.10, where the uncertainty
on different percentiles of an underlying distribution is indicated.


• The formulation of the uncertainty assessment is not definitive. Examples include approximations,
assumptions and expert judgements made during derivation of the uncertain input distributions,
or sources of uncertainty not considered in the analysis. In general, this uncertainty can
be difficult to quantify.


These items need to be considered when interpreting the results of the uncertainty quantification 
in the context of overall nuclear safety. 
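

The first source of uncertainty above (a finite number of samples) can be estimated by resampling. The following sketch uses a simple bootstrap, which is one possible technique and not necessarily the one used to produce Figure 4.10, to compare the sampling uncertainty on the median and on the 99th percentile of the same set of 500 results. The sample itself is synthetic and drawn from a standard normal distribution.

```python
import random
import statistics

rng = random.Random(2)
sample = [rng.gauss(0.0, 1.0) for _ in range(500)]   # synthetic 'model results'


def percentile(values, q):
    """Nearest-rank percentile (q in percent) of a list of values."""
    ordered = sorted(values)
    k = max(0, min(len(ordered) - 1, round(q / 100.0 * len(ordered)) - 1))
    return ordered[k]


def bootstrap_spread(values, q, n_boot=300):
    """Standard deviation of the q-th percentile over bootstrap resamples."""
    estimates = []
    for _ in range(n_boot):
        resample = rng.choices(values, k=len(values))
        estimates.append(percentile(resample, q))
    return statistics.stdev(estimates)


spread_p50 = bootstrap_spread(sample, 50)
spread_p99 = bootstrap_spread(sample, 99)
print(f"spread of median = {spread_p50:.3f}, spread of 99th percentile = {spread_p99:.3f}")
```

The spread on the 99th percentile is noticeably larger than that on the median for the same number of samples, which is the effect described in the first bullet above.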


Statistical Tolerance Limits 


Where confidence limits are the values within which a given population parameter, such as the 
mean, is expected to lie, statistical tolerance limits are the values within which a stated proportion 
of the population is expected to lie (NIST, 2021). 


A significant advantage of this methodology is that no a priori reduction in the number of 
uncertain input parameters by expert judgement or screening calculations is necessary 
to limit the calculation effort. All potentially important parameters may be included in the 
uncertainty analysis. The method accounts for the combined influence of all identified 
input uncertainties on the results. This would be difficult or even impossible to achieve 
by a priori expert judgement of loss of coolant accidents or transients. The number of 
calculations needed is independent of the number of uncertain parameters accounted 
for in the analysis. It does, however, depend on the requested tolerance limits, that is, 
the requested probability coverage (percentile) of the combined effect of the quantified 
uncertainties, and on the requested confidence level of the code results. The tolerance 
limits can be used for quantitative statements about margins to acceptance criteria. 


Glaeser, 2008 


... the key point is about the definition and interpretation of a result given by computing (u%, v%)
as statistical tolerance limits (Hofer, 2018):


• The two-sided (u%, v%) statistical tolerance limit contains at least u% of the population of
possibly true answers to the assessment question at a confidence level of at least v%.


• The upper [one-sided] (u%, v%) statistical tolerance limit says that at least u% of the population
of possibly true answers to the assessment question do not exceed this limit value at
a confidence level of at least v%.


The confidence level (v%) is specified because the statements about the quantile covered (u%) are
obtained from a random sample of limited size (illustrated in Figure 4.10). The attraction of these
limits is that the number of simulations required to satisfy them can be determined in advance,
and is independent of the number of uncertain variables. For example, N = 59 model evaluations
are needed for a one-sided (95%, 95%) tolerance limit, and N = 93 are needed for a two-sided
(95%, 95%) tolerance limit (Glaeser, 2008). Reducing the required confidence level permits a lower number
of evaluations: N = 45 is needed for a one-sided (95%, 90%) tolerance limit.
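

The sample sizes quoted above can be reproduced with a short calculation. The sketch below assumes the first-order expressions usually attributed to Wilks, in which the largest (and, for the two-sided case, also the smallest) of N simple random samples is taken as the limit: the one-sided condition is 1 − uᴺ ≥ v and the two-sided condition is 1 − uᴺ − N(1 − u)uᴺ⁻¹ ≥ v, with u and v expressed as fractions. The function names are illustrative.

```python
def n_one_sided(u, v):
    """Smallest N such that the sample maximum bounds a fraction u with confidence v."""
    n = 1
    while 1.0 - u**n < v:
        n += 1
    return n


def n_two_sided(u, v):
    """Smallest N such that (sample min, sample max) covers a fraction u with confidence v."""
    n = 2
    while 1.0 - u**n - n * (1.0 - u) * u ** (n - 1) < v:
        n += 1
    return n


print(n_one_sided(0.95, 0.95))   # one-sided (95%, 95%)
print(n_two_sided(0.95, 0.95))   # two-sided (95%, 95%)
print(n_one_sided(0.95, 0.90))   # one-sided (95%, 90%)
```

Neither expression contains the number of uncertain inputs, which is why the required N is independent of it.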


While this predictability and independence from the number of uncertain variables offers benefits,
and does not require the application of surrogate models to the samples, the confidence limits are
not automatically achieved simply by performing any selection of N samples. The validity of the
tolerance limits comes with other requirements and limitations (Hofer, 1999; Hofer, 2018; Porter,
2019), such as being non-tolerant of code run failures, requiring low model and numerical uncertainty
and placing requirements on the quality and completeness of the definition of the input
uncertainties. It is also restricted to using simple random sampling, so methods such as LHS cannot
be used.


Chebyshev’s Rule 


Chebyshev’s rule states that for any probability distribution for which a mean and standard deviation 
can be derived (Vose, 2008): 


For any number k greater than 1, at least (1 − 1/k²) of the measurements will fall within
k standard deviations of the mean.


For example, for any probability distribution at least 75 % of the measurements must fall within 2 standard
deviations of the mean (k = 2). Chebyshev’s rule is a more general version of the rule that, for a
normal distribution, approximately 68 %, 95 % and 99.7 % of the measurements fall within 1, 2 and 3 standard
deviations of the mean (also known as the 68-95-99.7 rule). By necessity, Chebyshev’s rule is
conservative, and a more precise estimate can be made if the specific distribution form is known.
However, Chebyshev’s rule may be used to define:


1. A conservative interval range when characterising an uncertain epistemic input. 


2. A confidence interval on an output metric when the precise distribution form is unknown or 
uncertain. 
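

The degree of conservatism in Chebyshev’s rule can be seen by comparing its guaranteed minimum coverage with the coverage of a normal distribution for the same number of standard deviations. This is a minimal illustrative sketch; the function names are not from any particular library.

```python
from statistics import NormalDist


def chebyshev_min_fraction(k):
    """Guaranteed minimum fraction within k standard deviations, any distribution."""
    if k <= 1:
        raise ValueError("Chebyshev's rule is only informative for k > 1")
    return 1.0 - 1.0 / k**2


def normal_fraction(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    nd = NormalDist()
    return nd.cdf(k) - nd.cdf(-k)


def chebyshev_k_for(fraction):
    """Number of standard deviations Chebyshev's rule needs to cover a given fraction."""
    return (1.0 / (1.0 - fraction)) ** 0.5


for k in (2, 3):
    print(k, chebyshev_min_fraction(k), round(normal_fraction(k), 4))
print(round(chebyshev_k_for(0.95), 2))
```

For 95 % coverage, Chebyshev’s rule requires about 4.5 standard deviations, compared with about 2 for a normal distribution, which shows the price paid for making no assumption about the distribution form.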


Combining Uncertainty 


The propagation of uncertainty discussed in Section 4.6 is associated with input uncertainties that 
can largely be described by continuous probability distributions. However, as noted in Section 4.2, 
epistemic uncertainties may be represented by intervals and combined uncertainties via imprecise 
probability. This section discusses methods and techniques by which these and other forms of 
uncertainty can also be included in a UQ calculation. The propagation and combination of differ- 
ent sources of uncertainty, and the mathematical tools to reduce the computational cost of such 
activities, is an area of current research (for example, Schöbi and Sudret, 2017).


Propagating Epistemic and Aleatory Uncertainties 


A common approach taken to propagate and combine epistemic and aleatory uncertainties is to 
use nested sampling. In an outer sampling loop a sample is taken for each of the epistemic un- 
certainties, giving a realisation of a single possible epistemic uncertain position. In the inner loop, 
for the given epistemic realisation, a probabilistic calculation of the aleatory uncertainties, as described
in Section 4.6, is performed. The result of this nested sampling is an ensemble of aleatory
distributions of output metric uncertainty, similar to that shown in Figure 4.3. 


The epistemic uncertainty may be represented either by a probability distribution (based on the 
belief of an analyst or other numerical evidence) or an interval. These different interpretations
of the epistemic uncertainties give rise to two common methods of implementing the outer loop
described above.


Second-order Monte-Carlo 


In second-order (or nested) Monte-Carlo calculations, the epistemic uncertainties in the outer- 
loop are represented by probability distributions similar to the inner loop described in Section 4.6. 
Often these distributions will be subjective in nature if there is only limited information available to 
describe the uncertainty. The output of this method is that each CDF generated has a probability of 
occurrence. The overall uncertainty distribution is the convolution of the CDFs taking into account 
their relative probability of occurrence. 
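

A minimal sketch of the nested loop structure is given below. The `model` function, the triangular distribution assumed for the epistemic input and the sample counts are all hypothetical. Because each outer sample is drawn from the assumed epistemic distribution, the realisations carry equal weight, and pooling the inner samples is used here as a simple way to form the overall (probability-weighted) output distribution.

```python
import random


def model(bias, x):
    # Hypothetical model: 'bias' is an epistemic input, 'x' an aleatory input.
    return 500.0 + bias + 10.0 * x


def percentile(values, q):
    """Nearest-rank percentile for an integer q in percent."""
    ordered = sorted(values)
    rank = (q * len(ordered) + 99) // 100   # ceiling of q/100 * n
    return ordered[max(rank, 1) - 1]


rng = random.Random(5)
n_outer, n_inner = 50, 400
p95_per_realisation = []
pooled = []

for _ in range(n_outer):
    # Outer loop: one realisation of the epistemic input, drawn from an
    # assumed (subjective) distribution.
    bias = rng.triangular(-5.0, 15.0, 0.0)
    # Inner loop: propagate the aleatory input for this realisation.
    outputs = [model(bias, rng.gauss(0.0, 1.0)) for _ in range(n_inner)]
    p95_per_realisation.append(percentile(outputs, 95))
    pooled.extend(outputs)

# Equal-weight outer samples: pooling gives the overall output distribution.
p95_overall = percentile(pooled, 95)
print(min(p95_per_realisation), p95_overall, max(p95_per_realisation))
```

The total number of model evaluations is `n_outer * n_inner`, which is why nested sampling is usually only practical with fast models or surrogates.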


Probability Bounds Analysis 


The epistemic uncertainties alternatively can be represented by an interval, about which there is 
no knowledge of the structure of the uncertainty. In this case, the outer loop consists of taking 
samples from the epistemic uncertain intervals with no indication of the underlying probability. The 
output is similar to that for second-order Monte-Carlo, but nothing is known about the relative
likelihood of the individual CDFs, and it is appropriate to use a p-box representation of the output
metric uncertainty using a PBA. The combined output metric uncertainty is defined by the upper
and lower bound CDFs of the resulting p-box, so it is necessary to determine the combination of 
values of epistemic intervals that produce the widest output range. Defining these bounding cases 
is not trivial for a multi-dimensional problem and Roy and Oberkampf (2011) describe how it can 
be formulated as a constrained optimisation problem. 


The resulting p-box of the output metric gives a visualisation of the relative importance of the 
epistemic and aleatory uncertainty on the output metric and can be used to target future work to 
investigate the most significant specific areas of reducible epistemic uncertainty. For example, by
performing sensitivity studies investigating the impact of the epistemic uncertainty interval ranges on the
output metric, the p-box can be ‘pinched’ inwards to determine if further work to reduce uncertainties
can have a meaningful impact on the overall output metric uncertainty. DST may also be used
in a similar outer loop but instead of a single interval defining the epistemic uncertain input, the 
belief and plausibility functions are used instead. 
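

As an illustrative sketch of how a p-box can be assembled, the fragment below steps a single epistemic input across its interval, builds an aleatory CDF at each value and takes the envelope of the CDFs. The model, interval and limit are hypothetical. Stepping evenly across the interval is only adequate here because the assumed model is monotonic in the epistemic input, so the bounding cases sit at the interval ends; in general, finding the bounds requires the optimisation approach noted above.

```python
import bisect
import random


def model(bias, x):
    # Hypothetical model: 'bias' is an epistemic input, 'x' an aleatory input.
    return 500.0 + bias + 10.0 * x


rng = random.Random(4)
bias_lo, bias_hi = -5.0, 15.0     # assumed epistemic interval (no distribution implied)
n_outer, n_inner = 9, 2000

# One sorted set of aleatory results (an empirical CDF) per epistemic value.
cdfs = []
for i in range(n_outer):
    bias = bias_lo + i * (bias_hi - bias_lo) / (n_outer - 1)
    cdfs.append(sorted(model(bias, rng.gauss(0.0, 1.0)) for _ in range(n_inner)))


def cdf_value(sorted_values, y):
    """Empirical CDF: fraction of results less than or equal to y."""
    return bisect.bisect_right(sorted_values, y) / len(sorted_values)


# The p-box is the envelope of the ensemble of CDFs on a grid of output values.
grid = [460.0 + 2.0 * j for j in range(46)]
pbox_upper = [max(cdf_value(c, y) for c in cdfs) for y in grid]
pbox_lower = [min(cdf_value(c, y) for c in cdfs) for y in grid]

# Bounds on the probability of exceeding a hypothetical limit.
limit = 530.0
p_exceed_low = 1.0 - max(cdf_value(c, limit) for c in cdfs)
p_exceed_high = 1.0 - min(cdf_value(c, limit) for c in cdfs)
print(f"P(output > limit) lies between {p_exceed_low:.4f} and {p_exceed_high:.4f}")
```

The horizontal width of the p-box reflects the epistemic interval, while the slope of each bounding CDF reflects the aleatory uncertainty. Narrowing `bias_lo` and `bias_hi` shows the ‘pinching’ effect described above.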


Model Uncertainty 


Uncertainty in a thermal hydraulic model (model uncertainty) is harder to quantify than uncertainty 
in input values. It is inevitable that model uncertainty will depend on a modelling decision or a 
subjective expert opinion, especially when a model is used to predict the consequences of a hy- 
pothesised fault where many phenomena interact, and that these epistemic uncertainties cannot 
be easily characterised and propagated in the UQ. This is especially true when there is little or no 
experimental data, so it is not possible to obtain validation evidence for the situation being modelled 
(such as a severe accident). 


The approaches discussed in this section may significantly increase the number of calculations
required to quantify the uncertainty in an output metric. As a result, these methods may be best
applied to system models (Section 3.3), which are less computationally expensive than more complex
CFD models. Furthermore, system code modelling approaches may require a greater degree
of approximation and modelling judgement (introducing user effects) than a CFD model, and hence
the epistemic uncertainties in the model may have a greater effect on the output metric and require
greater consideration.


In practice, epistemic uncertainty in the model form may be significant and difficult to quantify in
any meaningful way. The effort expended in characterising and propagating this depends on how
important it is to understand its effects in the context of the overall use of the model output.


Model uncertainty may be formally incorporated into the UQ calculation by one of the following 
three methods (Tucker and Ferson, 2003): 


1. Try to re-express the model uncertainty as a probabilistic or epistemic uncertainty in the
UQ calculation. This way it can be propagated through the model as if it were another form
of uncertainty, characterised using the methods described in Section 4.2 and propagated
through the UQ using the methods outlined in Section 4.8.1. One possible approach is to
incorporate into the underlying model a meta-parameter which represents an average over
many possible different model uncertainties. The approach requires a subjective view of the
probability associated with the parameter to represent model uncertainty and the physical
reasonableness of this approach can be questioned. However, once implemented, the methods
described above can be used to quantify the impact on the output metric uncertainty. For
example, this could be done by characterising model uncertainty via an epistemic interval
meta-parameter in the outer sampling loop. As a less ideal, but potentially more practical option,
the meta-parameter could be assumed to be a continuous distribution and incorporated
into the inner sampling loop. However, this may make it difficult to separate the contributions
of epistemic and aleatory uncertainties in the output metric.


2. Work through each modelling decision, assumption or approximation using a set of structured
sensitivity studies to determine the impact on the output metric of interest. In Section
4.6 the model being used for propagation was treated as a ‘black box’, where the model input
parameters were perturbed and the effect on the outputs was observed without any interrogation
of the internal operations of the model itself. In ‘white box’ modelling, the user ‘peeks
inside the box’, focusing specifically on internal knowledge of the code to assess model uncertainty.
However, this rapidly becomes difficult for any complex model and performing this
analysis for any more than a few key cases is unlikely to be practicable.


3. Try to bound or judge the range of possible outcomes from any credible model which could
exist and use this to define the epistemic uncertainty on the model output. This uncertainty
may then be appended to the p-box defined by a PBA propagation of uncertainty. An assessment
of the magnitude of this uncertainty may be based on evidence from model validation
and extrapolation analysis (Roy and Oberkampf, 2011). In practice, this may be difficult for
complex models which contain a number of poorly understood phenomena and where the relationship
between the output metric of interest and available validation data is unclear.


Finally, model uncertainty may also incorporate aspects of numerical uncertainty in the UQ calculation
route. The most common sources to consider are likely to be (Mahadevan and Sarkar,
2009):


• Discretisation or convergence uncertainties in the underlying physical model.


• Errors in the UQ analysis, e.g. due to performing a limited number of Monte-Carlo samples
and uncertainties associated with using a surrogate model (e.g. the surrogate model residuals).


These can most readily be included in the UQ by incorporating them in the outer Monte-Carlo 
sampling loop, or in a PBA by appending the uncertainties to the edges of the p-boxes derived 
from propagating the other epistemic and aleatory uncertainties. 


Impact on the Number of Model Calculations Required 


Both second-order Monte-Carlo and PBA can substantially increase the computational require- 
ments of the UQ. For example, Roy and Oberkampf (2011) recommend that a minimum of three 
LHS samples are taken for each epistemic uncertainty, in combination with all of the remaining epis- 
temic uncertainties. The impact of this recommendation is that the minimum number of samples 
required increases as the cube of the number of epistemic uncertainties. Given that this repre- 
sents the outer loop in the nested Monte-Carlo sampling, the aleatory uncertainties must also be 
propagated through the model for each combination of epistemic uncertainties. 


This issue is often known as the ‘curse of dimensionality’, where the number of samples required
rapidly becomes computationally limiting (for example, see Kreinovich et al., 2006). The use of
sampling methods and surrogate models discussed in Sections 4.3 and 4.4 can help mitigate this, 
but for more than a small number of epistemic uncertainties it can be a significant limitation. 


When embarking on a UQ calculation, consideration needs to be given to the timescales and com- 
putational requirements of the task compared to the scope that can be achieved using the methods 
described in this technical volume. This should include the topics outlined in Section 4.3 along with 
the number of epistemic uncertainties that need to be propagated through the calculation. 


Inter-model Comparison 


It is possible for multiple analysts or teams to develop diverse models of the same NTH phe- 
nomenon (diverse in terms of underlying code, modelling assumptions and/or solution method). 
These models will not give identical results, even when using the same underlying physical model, 
due to user uncertainty. As a result, combining the results from a number of such models can be
used to assess the epistemic model uncertainty via the inter-model spread (this is used in disciplines
where the underlying physical models are not fully understood, such as climate modelling;
IPCC, 2014).


The cost and resource requirements often make it impractical to develop multiple models
in all circumstances, but for calculations with potentially large uncertainties and significant consequences,
consideration should be given to developing a small number of separate and independent
models to examine the epistemic uncertainties on model formulation and user effects.
Examining the output of such models, along with an estimation of the uncertainty on the output
metrics, can be highly informative and build confidence in the overall conclusions of the modelling
work, and this is the basis for international benchmarking activities, as discussed in Section 3.


Presenting Overall Uncertainty in an Output Metric


To make best use of the efforts applied in performing UQ, the combined overall uncertainty in the 
outputs of a model needs to be summarised and presented to relevant stakeholders and consid- 
ered in the context of the application and importance of the predictions, especially if they relate to 
nuclear safety. The calculated uncertainty on an output metric could be used in several ways: 


• The result could be compared against a performance target (or safety limit), as shown in Figure 1.2,
to provide substantiation of acceptability of the operation of the system. For example,
demonstrating robustly that fuel cladding remains intact during a specific fault scenario even
when accounting for modelling uncertainties. For a nested Monte-Carlo approach, the output
uncertainty will be a PDF or CDF from which the confidence level that the performance target is
met can be calculated directly. For a PBA the output will be a p-box against which the
performance target can be compared.


• The result could be used to set a performance target (or safety limit) for other aspects of
the system, which can be used to ensure safe operation. For example, a calculation for the
temperature of spent fuel and the uncertainty on this value could be used to constrain the
timescales of certain fuel handling operations, or the uncertainty on time to fuel damage
could be used to inform the required response time of automated plant protection C&I systems.


• The result could be used to identify key sources of epistemic uncertainty which must be
further reduced (through further experimental measurements, for example) in order to substantiate
future safe performance of a system. For example, a current safety limit may become
increasingly important as the plant degrades during operation. A UQ and SA prediction of
the core age when the safety limit is reached could be used to define an experimental programme
or modelling activities to reduce epistemic uncertainties and inform life-extension or
to help plan decommissioning activities.


Most importantly, consideration must be given to the nuclear safety consequences associated with 
a calculation. For some analyses, it will be desirable to demonstrate that there is additional margin 
above the calculated uncertainty, which implies that reaching the safety limit is essentially impossi- 
ble (there remains substantial margin between the upper limit of the calculated uncertainty and the 
safety limit). This could be required for assessments for which a failure might result in a significant 
level of radiological release. In other assessments, it may be more appropriate to demonstrate that, 
to a given confidence level, the probability of failure is below a certain threshold. 


It is also possible that there is uncertainty on a performance target which also needs consideration 
alongside uncertainty in the output metric. For example, a fuel damage temperature may be derived 
from experiments which contain their own uncertainty. Figure 4.11 demonstrates this scenario. 
Under such circumstances a variety of methods for interpreting the results of the UQ could be 
applied. For example, a conservative lower bound on the safety limit could be used; hypothesis 
testing could demonstrate that a percentile of the output metric distribution is less than a percentile 
on the performance target (to a given probability); or the Bhattacharyya distance (EMS, 2021) 
could be used to measure the level of overlap between the two distributions.


Figure 4.11: Example assessment of uncertainty in both an output metric and a safety limit.


The needs of a UQ assessment with respect to these issues should be derived from the require- 
ments under which the modelling activity it supports is being commissioned, and can be structured 
within the organisational approaches discussed in Section 3. 
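

As an example of one of the overlap measures mentioned above, the Bhattacharyya distance has a closed form when both the output metric and the uncertain limit are approximated by normal distributions. That normal approximation is an assumption made here for illustration, as are the numerical values. The sketch evaluates the distance and the corresponding Bhattacharyya coefficient.

```python
import math


def bhattacharyya_normal(mu1, sigma1, mu2, sigma2):
    """Bhattacharyya distance between two univariate normal distributions."""
    var_sum = sigma1**2 + sigma2**2
    return 0.25 * (mu1 - mu2) ** 2 / var_sum + 0.5 * math.log(
        var_sum / (2.0 * sigma1 * sigma2)
    )


# Illustrative values only: a predicted temperature and an uncertain damage limit.
distance = bhattacharyya_normal(900.0, 30.0, 1100.0, 40.0)
overlap = math.exp(-distance)   # Bhattacharyya coefficient: 1 = identical, 0 = no overlap
print(f"distance = {distance:.3f}, overlap coefficient = {overlap:.4f}")
```

A larger distance (smaller coefficient) indicates less overlap between the output metric and the limit. For empirical distributions from sampling, the coefficient would instead be estimated from binned or kernel-smoothed densities.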


6 Abbreviations


AIC       Akaike Information Criterion
ASME      American Society of Mechanical Engineers
BEMUSE    Best Estimate Methods Uncertainty and Sensitivity Evaluation
BEPU      Best Estimate Plus Uncertainty
BWR       Boiling Water Reactor
C&I       Control & Instrumentation
CAD       Computer Aided Design
CASL      Consortium for Advanced Simulation of Light Water Reactors
CCDF      Complementary Cumulative Distribution Function
CDF       Cumulative Distribution Function
CFD       Computational Fluid Dynamics
CFL       Courant-Friedrichs-Lewy
CHT       Conjugate Heat Transfer
CI        Confidence Interval
CLT       Central Limit Theorem
CPU       Central Processing Unit
CSAU      Code Scaling, Applicability and Uncertainty
CSNI      Committee on the Safety of Nuclear Installations
DACE      Design and Analysis of Computer Experiments
DNS       Direct Numerical Simulation
DoE       Design of Experiments
DST       Dempster-Shafer Theory
EBR       Experimental Breeder Reactor
EMDAP     Evaluation Methodology Development and Application Process
FEA       Finite Element Analysis
FORM      First Order Reliability Method
FSI       Fluid-Structure Interaction
GCI       Grid Convergence Index
Gen IV    Generation IV
GRS       Gesellschaft für Anlagen- und Reaktorsicherheit
IAEA      International Atomic Energy Agency
IET       Integral Effect Test
IUQ       Inverse Uncertainty Quantification
IVP       Interval-Valued Probability
KDE       Kernel Density Estimation
LES       Large Eddy Simulation
LHS       Latin Hypercube Sampling
LOCA      Loss-Of-Coolant Accident
LWR       Light Water Reactor
M&S       Modelling and Simulation
ML        Machine Learning
MLE       Maximum Likelihood Estimation
MMS       Method of Manufactured Solutions
MSR       Molten Salt Reactor
NEA       Nuclear Energy Agency
NGNP      Next Generation Nuclear Plant
NPP       Nuclear Power Plant
NTH       Nuclear Thermal Hydraulics
OECD      Organisation for Economic Co-operation and Development
OLS       Ordinary Least Squares Regression
ONR       Office for Nuclear Regulation
PBA       Probability Bounds Analysis
PCE       Polynomial Chaos Expansion
PCMM      Predictive Capability Maturity Model
PCT       Peak Cladding Temperature
PDE       Partial Differential Equation
PDF       Probability Density Function
PIRT      Phenomena Identification and Ranking Table
PMF       Probability Mass Function
POD       Proper Orthogonal Decomposition
PREMIUM   Post-BEMUSE Reflood Model Input Uncertainty Methods
PSA       Probabilistic Safety Analysis
PWR       Pressurised Water Reactor
QA        Quality Assurance
QMU       Quantification of Margins and Uncertainties
RANS      Reynolds-Averaged Navier-Stokes
RMS       Root Mean Square
ROM       Reduced Order Model
SA        Sensitivity Analysis
SAPIUM    Systematic APproach for Input Uncertainty quantification Methodology
SET       Separate Effect Test
SFR       Sodium-cooled Fast Reactor
SORM      Second Order Reliability Method
SQA       Software Quality Assurance
SQE       Software Quality Engineering
SRQ       System Response Quantity
TRISO     TRIstructural-ISOtropic
UMAE      Uncertainty Methodology based on Accuracy Extrapolation
UMS       Uncertainty Method Study
UQ        Uncertainty Quantification
URANS     Unsteady Reynolds-Averaged Navier-Stokes
US NRC    United States Nuclear Regulatory Commission
V&V       Verification and Validation
VHTR      Very High Temperature Reactor